


DUKE MATHEMATICAL
JOURNAL


EDITED BY 
LEONARD CARLITZ DAVID VERNON WIDDER 


JOSEPH MILLER THOMAS 
Managing Editor 


WITH THE COOPERATION OF 


R. P. BOAS, JR. NATHAN JACOBSON A. J. MARIA J. A. SHOHAT
H. E. BRAY C. G. LATIMER F. J. MURRAY R. J. WALKER
J. J. GERGEN NORMAN LEVINSON S. B. MYERS MORGAN WARD
G. A. HEDLUND E. J. McSHANE OYSTEIN ORE G. T. WHYBURN
C. C. MacDUFFEE J. H. ROBERTS 


Volume 5, Number 2 
JUNE, 1939 


Copyright, 1939


DUKE UNIVERSITY 
DURHAM, N. C. 





DUKE MATHEMATICAL JOURNAL 


This periodical is published quarterly under the auspices of Duke University
by Duke University Press at Durham, North Carolina. It is printed at Mount
Royal and Guilford Avenues, Baltimore, Maryland, by the Waverly Press.

Entered as second class matter at the Post Office, Durham, North Carolina.
Additional entry at the Post Office, Baltimore, Maryland.

The subscription price for the current volume is four dollars, postpaid; back
volumes, five dollars each, carriage extra; single numbers, one dollar and
twenty-five cents, postpaid. Individual members of the Mathematical Association
of America may subscribe at half price; mention of such membership should be
made when subscribing. Subscriptions, orders for back numbers, and notices of
change of address should be sent to Duke University Press, Durham, North
Carolina.

Manuscripts and editorial correspondence should be addressed to Duke
Mathematical Journal, 4785 Duke Station, Durham, North Carolina. An
Author's Manual containing detailed information about the preparation of
papers for publication will be sent on request.

Manuscripts submitted for consideration by the editors should be
typewritten, with the exception of the formulas, which may be filled in by hand.
It is recommended, though not required, that footnotes be numbered
consecutively with Arabic numerals, and that they be placed at the end of the
manuscript. A footnote will appear, however, at the bottom of the printed
page of the journal on which reference is made to it. The author should keep a
duplicate copy of the manuscript as a precaution against loss in transit and as an
aid in reading proof. The original manuscript will not accompany the galley
proof sent to the author unless requested. Such a request may delay the
appearance of the article. Drawings must be made in black India ink on Bristol
board and fully lettered.

Authors are entitled to one hundred free reprints. Additional copies will be
furnished at cost. All reprints will be furnished with covers unless the contrary
is specifically requested.


The American Mathematical Society is officially represented on the Editorial
Board by Professors Ore and Whyburn.


Made in United States of America 


WAVERLY PRESS, INC. 
BALTIMORE, U. S. A.













SURFACES OF NEGATIVE CURVATURE AND PERMANENT 
REGIONAL TRANSITIVITY 


By ANNA GRANT 


1. Introduction. The various problems connected with transitivity have
been treated extensively for the flows defined by the geodesics on two-dimensional
manifolds of negative curvature. A description of the extent to which
solutions of the problems have been attained has been given by Hedlund [7].¹

The manifolds in question can be obtained by identifying the points congruent
under a Fuchsian group. The present paper shows that if the Fuchsian group
is of the first kind and the manifold is of negative curvature, the property of
permanent regional transitivity holds. That is, the geodesics define a flow in the
space of elements such that if $O$ is any open set of elements at time $t = 0$, $O_t$ is
the image of $O$ after time $t$, and $O^*$ is any other open set of elements, there exists
a $\bar{t}$ such that for $|t| > \bar{t}$ the set $O_t \cdot O^*$ is not empty. This result is thus an
extension of a similar result obtained by Hedlund [6] in the case of constant negative
curvature. The extension requires the derivation of numerous geometric
results which should be useful in the further study of the geodesic flows on the
surfaces under consideration.


2. A class of simply-connected two-dimensional manifolds. Let U denote
the unit circle $u^2 + v^2 = 1$, and let $\Psi$ be its interior, with the following metric
defined in $\Psi$:

(2.1) $ds^2 = \dfrac{\lambda^2(u, v)\,(du^2 + dv^2)}{(1 - u^2 - v^2)^2}$,

with $\lambda(u, v)$ of class $C^m$, $m \ge 5$, and $0 < a \le \lambda(u, v) \le b$ in $\Psi$. The length of any
curve segment of class $C'$ in $\Psi$ is $\int ds$ evaluated over the curve, ds given by
(2.1). The geodesics defined by (2.1) are of at least class C* in arc length,
coördinates of initial point, and initial direction. The term geodesic will refer
to the geodesics defined by (2.1). Given a point in $\Psi$ and direction at this
point, there is a unique geodesic passing through the given point in the given
direction.

Received August 24, 1938; in revised form, April 8, 1939. The writer is greatly
indebted to Gustav A. Hedlund for suggestions and help in the preparation of this paper.
¹ Numbers in brackets refer to the bibliography at the end of the paper.

If $\lambda(u, v) = 2$ in $\Psi$, the geodesics are arcs of circles orthogonal to U and are
called hyperbolic lines. Given any two points P and Q in $\Psi$, there is a unique
hyperbolic line segment joining them; and $\int ds$ evaluated over this segment,
where ds is given by (2.1) with $\lambda(u, v) = 2$, is called the hyperbolic distance
between P and Q.
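As an illustration of the hyperbolic distance in the case $\lambda(u, v) = 2$, the following minimal sketch uses the standard closed form $2\,\mathrm{artanh}\,|z - w| / |1 - \bar{w} z|$ for the metric $ds = 2|dz|/(1 - |z|^2)$. The closed form is not stated in the paper and is brought in here as an assumption; the function name `hyperbolic_distance` is hypothetical.

```python
import math


def hyperbolic_distance(p, q):
    """Distance between points p = (u, v) and q = (u, v) of the open unit disk
    for the metric ds = 2 |dz| / (1 - |z|^2), i.e. (2.1) with lambda = 2.

    Assumption: the standard closed form for the Poincare disk is used;
    it is not taken from the paper itself.
    """
    z, w = complex(*p), complex(*q)
    if abs(z) >= 1 or abs(w) >= 1:
        raise ValueError("both points must lie inside the unit circle U")
    ratio = abs(z - w) / abs(1 - w.conjugate() * z)
    return 2.0 * math.atanh(ratio)
```

For instance, the distance from the origin to the point (0.5, 0) evaluates to $2\,\mathrm{artanh}(0.5) = \log 3$, in agreement with integrating $2\,du/(1 - u^2)$ along the diameter.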

A geodesic segment joining points P and Q of $\Psi$ is of class A (Morse [9], p. 40)
if its length is not greater than that of any other rectifiable curve joining P
and Q. Given two points P and Q of $\Psi$, there exists a class A geodesic segment
joining these points (Hedlund [5], p. 535). The distance between P and Q will
be defined as the length of a class A geodesic segment joining P and Q. It will
be denoted simply by PQ. It is easily shown that this distance satisfies the
usual properties of a metric. Also, Hedlund ([5], p. 536) noted that the following
fundamental theorem of Morse ([9], p. 41) holds in the case under consideration.


I. There exists a positive constant D determined by $\lambda(u, v)$ of (2.1) such that in $\Psi$
no class A geodesic segment can recede a hyperbolic distance greater than D from
the hyperbolic line segment joining its end points.


Any geodesic ray in $\Psi$ can be continued to infinite length; that is, the geodesics
in $\Psi$ are unending. This can be proved directly, but it is a simple consequence
of a result of Hopf and Rinow (cf. Hopf-Rinow [8], p. 215) which states that the
above condition is equivalent to the condition that each bounded set in $\Psi$ be
compact, where the metric in terms of which bounded is defined is that already
given by distance. A set in $\Psi$ which is bounded in the sense of the hyperbolic
metric is evidently compact in the sense of the hyperbolic metric. But the
distance lies between two constant multiples of the hyperbolic distance (Morse
[9], p. 35; Hedlund [5], p. 535), so that a set which is bounded or compact in
terms of distance is bounded or compact, respectively, in terms of hyperbolic
distance, and conversely. Thus a set which is bounded in terms of distance
is bounded in terms of hyperbolic distance. But then it is compact in the
sense of hyperbolic distance and thus compact in the sense of distance. This
proves the desired result.

Let an unending geodesic be of class A (Morse [9], p. 40) if every finite segment
of it is of class A, and let two unending curves be of the same type (Morse
[9], p. 42) if there exists a constant $\delta$ such that any point of either is at a
hyperbolic distance less than $\delta$ from some point of the other. Then the following
results, essentially Morse’s, hold.


II. (Morse [9], p. 44.) Corresponding to any hyperbolic line, there exists at
least one unending geodesic of class A of the same type. Conversely, each unending
geodesic of class A is of the type of some hyperbolic line.

III. (Morse [10], p. 54.) Corresponding to each hyperbolic ray issuing from
a point P in $\Psi$, there exists a geodesic ray of class A with P as initial point and of
the same type as the hyperbolic ray. Conversely, each class A geodesic ray issuing
from P is of the type of some hyperbolic ray issuing from P. There exists a constant
$\mu$, determined by $\lambda(u, v)$ of (2.1), such that the type distance between two class A
geodesic rays of the same type and issuing from the same point, or between a class A
geodesic ray and a hyperbolic ray of the same type and issuing from the same point,
never exceeds $\mu$.
















3. The above manifold with negative curvature imposed. Since in the
metric (2.1) E = G, F = 0, the formula for Gaussian curvature K reduces to

$$K(u, v) = \frac{1}{2E^3}\left(E_u^2 + E_v^2 - E E_{uu} - E E_{vv}\right)
= -\frac{4}{\lambda^2} + \frac{1}{\lambda^4}\,(1 - u^2 - v^2)^2\left(\lambda_u^2 + \lambda_v^2 - \lambda\lambda_{uu} - \lambda\lambda_{vv}\right).$$

It will be assumed that $\lambda(u, v)$ is such that

(3.1) $K(u, v) < 0$, $u^2 + v^2 < 1$.

It may be noted that this condition is satisfied if $\lambda(u, v)$ is nearly constant in
value.
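The curvature formula above can be checked numerically. The following minimal sketch evaluates the second form of K, written in terms of $\lambda$ and its derivatives, using central differences; the function name `curvature`, the step size `h`, and the sample choice $\lambda = 2 + 0.1u$ are illustrative assumptions and do not come from the paper.

```python
def curvature(lam, u, v, h=1e-4):
    """Approximate the Gaussian curvature K(u, v) of the metric (2.1)
    from K = -4/lam^2 + (1 - u^2 - v^2)^2 (lam_u^2 + lam_v^2
    - lam*lam_uu - lam*lam_vv) / lam^4.

    lam is a callable lam(u, v); its derivatives are estimated by
    central differences with step h (an illustrative choice).
    """
    l0 = lam(u, v)
    lu = (lam(u + h, v) - lam(u - h, v)) / (2 * h)
    lv = (lam(u, v + h) - lam(u, v - h)) / (2 * h)
    luu = (lam(u + h, v) - 2 * l0 + lam(u - h, v)) / h**2
    lvv = (lam(u, v + h) - 2 * l0 + lam(u, v - h)) / h**2
    w = (1 - u * u - v * v) ** 2
    return -4 / l0**2 + w * (lu**2 + lv**2 - l0 * luu - l0 * lvv) / l0**4
```

With the constant choice `lam = lambda u, v: 2.0` the sketch returns the constant curvature -1 of the hyperbolic metric, and with the nearly constant choice `lam = lambda u, v: 2.0 + 0.1 * u` the value stays negative at the sampled points, in line with the remark on condition (3.1).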

It follows from the Gauss-Bonnet formula that the sum of the angles of a
geodesic triangle is less than $\pi$ and that no two geodesics can intersect twice.
Given two points P and Q of $\Psi$, there is just one geodesic segment joining
P and Q. This will be denoted by gs(P, Q). The distance PQ is the length
of gs(P, Q) and is a continuous function of the coördinates of P and Q (cf.
Bieberbach [1], p. 299). The unique geodesic on which P and Q lie will be denoted
by g(P, Q).

Since no two geodesics can intersect twice, all geodesics are of class A, and
it follows from II, §2, that each unending geodesic g is of the type of some
hyperbolic line h. The points A and B in which h meets U will be called the
points at infinity of g. The points in $\Psi$ at hyperbolic distance less than or equal
to D from h lie in a region bounded by two circular arcs with end points A and B.
It follows from I, §2, that g must lie in this region; and since g can have no
multiple points, g, together with its points at infinity A and B, forms a simple
continuous curve joining A and B. Thus g divides $\Psi$ into two parts.

Similarly, from III, §2, a geodesic ray with initial point P is of the type of a
hyperbolic ray with initial point P. The point in which the hyperbolic ray
meets U will be called the point at infinity of the geodesic ray.

The geodesics through an arbitrary point P of $\Psi$ form a field in $\Psi$ except at P
and thus geodesic polar coördinates which extend throughout $\Psi$ can be set up
with P as center. The element of distance has the form

(3.2) $ds^2 = dr^2 + M^2(r, \varphi)\,d\varphi^2$, $M_{rr}(r, \varphi) = -K(r, \varphi)\cdot M(r, \varphi)$, $M_r(0, \varphi) = 1$,

where $K(r, \varphi)$ is the Gaussian curvature. Since $K(r, \varphi) < 0$, it follows readily
that

(3.3) $M(r, \varphi) > r$.
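The inequality (3.3) can be illustrated numerically by integrating the equation $M_{rr} = -K M$ of (3.2) along a fixed geodesic ray. The sketch below is a minimal illustration: it assumes the usual polar-coördinate initial value $M(0, \varphi) = 0$ (not written out in (3.2)) together with $M_r(0, \varphi) = 1$, and uses a classical fourth-order Runge-Kutta step; the names `jacobi_M` and `steps` are hypothetical.

```python
import math


def jacobi_M(K, r_max, steps=2000):
    """Integrate M'' = -K(r) M with M(0) = 0, M'(0) = 1 up to r = r_max
    by the classical Runge-Kutta method and return M(r_max).

    K is a callable giving the curvature along one geodesic ray as a
    function of r.  M(0) = 0 is an assumption (the standard normalization
    for geodesic polar coordinates).
    """
    def rhs(r, m, dm):
        # First-order system: (M)' = M', (M')' = -K(r) M.
        return dm, -K(r) * m

    h = r_max / steps
    r, m, dm = 0.0, 0.0, 1.0
    for _ in range(steps):
        k1 = rhs(r, m, dm)
        k2 = rhs(r + h / 2, m + h / 2 * k1[0], dm + h / 2 * k1[1])
        k3 = rhs(r + h / 2, m + h / 2 * k2[0], dm + h / 2 * k2[1])
        k4 = rhs(r + h, m + h * k3[0], dm + h * k3[1])
        m += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dm += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        r += h
    return m
```

For constant curvature K = -1 the computed value agrees with $\sinh r$, which exceeds r for r > 0; a variable negative curvature such as $K(r) = -(1 + 0.5 \sin r)$ likewise gives $M > r$ at the sampled radius.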


Let P be a point of $\Psi$ and let g, with points at infinity A and B, be a geodesic
not passing through P. Let s be the arc length on g measured from a point $Q_0$
and positive in the direction of B. If Q(s) is the point of g with coördinate s,
let $\theta(s)$ be the angle between the directed geodesic segment with initial point
Q(s) and terminal point P and the geodesic ray with initial point Q(s) and point
at infinity A. Then the angle $\theta(s)$ is a continuous, decreasing function of s
such that $\lim_{s \to -\infty} \theta(s) = \pi$ and $\lim_{s \to +\infty} \theta(s) = 0$. For assuming $\Delta s > 0$ and applying
the Gauss-Bonnet formula to the geodesic triangle with vertices P, Q(s) and
$Q(s + \Delta s)$, we obtain

(3.4) $\theta(s) = \theta(s + \Delta s) + \alpha(s, \Delta s) - \iint K\,dA$,

where $\alpha(s, \Delta s)$ is the angle at P in this triangle and the integral is extended
over the triangle. Since K < 0, $\theta(s) > \theta(s + \Delta s)$
and $\theta(s)$ is a decreasing function. Since the last two terms on the right in (3.4)
approach zero with $\Delta s$, $\theta(s)$ is continuous on the right. Similarly, it can be
shown that it is continuous on the left. Since $\theta(s)$ is a decreasing positive function,
$\lim_{s \to +\infty} \theta(s)$ exists; let us suppose that it is not zero. With Q(s) as center
set up geodesic polar coördinates. For any particular value of s, the length
of the geodesic segment $gs(P, Q_0)$ will equal

$$\int_0^{\theta(s)} \left(r'^2 + M^2\right)^{1/2} d\varphi \ge \int_0^{\theta(s)} M\,d\varphi,$$

and $\theta(s)$ is bounded away from zero. But by choosing s sufficiently large, it
follows from (3.3) that M can be made arbitrarily large, and thus the length
$PQ_0$ would not be bounded. We infer that $\lim_{s \to +\infty} \theta(s) = 0$. A similar argument
shows that $\lim_{s \to -\infty} \theta(s) = \pi$.

If g is an arbitrary geodesic and P is an arbitrary point of $\Psi$, it follows from
the preceding results that there exists a unique geodesic through P which meets
g at right angles. It is called the geodesic through P normal to g, and the
length of the segment of it from P to the intersection with g is the normal
distance. From the Gauss-Bonnet formula, two geodesics which are both normal
to a given geodesic g cannot intersect and thus the geodesics normal to a given
geodesic form a field in $\Psi$. We can now set up geodesic normal coördinates
using g as base, arc length along it as coördinate x, and arc length along the
geodesics perpendicular to it as coördinate y. The element of distance assumes
the form

(3.5) $ds^2 = M^2(x, y)\,dx^2 + dy^2$.


If P is any point not on g, and $y_P$ is the normal distance from P to g, any
geodesic segment from P to g has length l given by

$$l = \int_{t_0}^{t_1} \left(y'^2 + M^2 x'^2\right)^{1/2} dt \ge \int_{t_0}^{t_1} \left(y'^2\right)^{1/2} dt \ge y_P,$$

and thus the normal distance from P to g is the least geodesic distance from
P to g. Let Q be the point of g such that the geodesic segment gs(P, Q) is
normal to g, and let s be the length on g measured from Q. It can be shown
that the distance d(s) from the point P to the variable point R(s) of g is a
continuous function of s which is increasing for s > 0 and decreasing for s < 0
such that

$$\lim_{s \to \pm\infty} d(s) = +\infty.$$

With regard to the distance between two geodesics g and g’, it is known
(Hadamard [4], p. 55) that if s is the arc length measured on g, the signed normal
distance n(s) to g’ from the point P(s) of g is continuous, $-\infty < s < +\infty$, and
varies according to one of the following conditions.

1. The geodesics g and g’ intersect and n(s) either increases from $-\infty$ to $+\infty$
or decreases from $+\infty$ to $-\infty$.

2. The geodesics g and g’ do not intersect, n(s) does not change sign and
|n(s)| either increases from 0 to $+\infty$ or decreases from $+\infty$ to 0. In this case
g and g’ are said to be asymptotic.

3. The geodesics g and g’ do not intersect, n(s) does not change sign, and
|n(s)| decreases from $+\infty$ to a positive minimum and then increases to $+\infty$.

These cases depend only on g and g’ and remain the same if the rôles of g
and g’ in the definition of n(s) are interchanged.

Since g (g’) is of the type of a hyperbolic line h (h’), the normal distance
between h and h’ will display the same properties with respect to cases 1, 2, or 3
as does the normal distance between g and g’. Thus case 1 occurs if the points
at infinity of g and g’ are all distinct and separate each other on U; case 2 occurs
if g and g’ have just one common point at infinity; case 3 occurs if the points at
infinity of g and g’ are all distinct and do not separate each other on U. In
particular we conclude that g and g’ cannot have the same points at infinity
and therefore there is just one geodesic of the type of a given hyperbolic line.
The unique geodesic with points at infinity A and B will be denoted by g(A, B).

It follows from I and III, §2, that two geodesic rays issuing from the same
point P of $\Psi$ must have distinct points at infinity. The unique geodesic ray
with initial point P and point at infinity A will be denoted by gr(P, A).

Given A on U and P in $\Psi$, there exists a geodesic passing through P and
having A as one of its points at infinity. Since no two geodesics with A as
point at infinity can intersect in $\Psi$, the geodesics with a common point at
infinity form a field in $\Psi$.

The distance between two points of $\Psi$ has been defined. A sequence of points
in $\Psi$ approaches a point P of $\Psi$ if the distance from P of the points of the
sequence approaches zero. The distance between two points of U will be defined as
Euclidean distance, and a point varies continuously on U if its distance from a
fixed point of U varies continuously. A sequence of points within or on U
shall be said to approach a point A of U as limit point if the Euclidean distance
of the points of the sequence from A approaches zero.

An element is a point of $\Psi$, P(u, v), and a direction $\varphi$ at this point, where it
can be assumed that $\varphi$ is measured from the direction parallel to the positive
u-axis. An element thus has three coördinates and will be denoted by $p(u, v, \varphi)$.
















The point P(u, v) will be said to be the point bearing $p(u, v, \varphi)$. The distance
between the elements $p(u_1, v_1, \varphi_1)$ and $q(u_2, v_2, \varphi_2)$ will be defined as the sum
of the distance between $(u_1, v_1)$ and $(u_2, v_2)$ and $\|\varphi_2 - \varphi_1\|$, where this denotes
the least numerical value of the set $|\varphi_2 - \varphi_1 + 2n\pi|$ $(n = 0, \pm 1, \cdots)$.
Continuity and limit point in the space of elements are understood to be defined
in terms of this metric.
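The metric on the space of elements just defined can be sketched in a few lines. In the following illustration the names `angle_gap` and `element_distance` are hypothetical, and the distance between the bearing points is supplied by the caller as `point_distance`, since the geodesic distance PQ of the paper has no closed form in general.

```python
import math


def angle_gap(phi1, phi2):
    """Least value of |phi2 - phi1 + 2*n*pi| over all integers n."""
    d = abs(math.fmod(phi2 - phi1, 2 * math.pi))  # now 0 <= d < 2*pi
    return min(d, 2 * math.pi - d)


def element_distance(p, q, point_distance):
    """Distance between elements p = (u1, v1, phi1) and q = (u2, v2, phi2):
    the distance between the bearing points plus the angular gap.

    point_distance is a caller-supplied function of two points (u, v);
    it stands in for the distance PQ defined in the paper.
    """
    (u1, v1, phi1), (u2, v2, phi2) = p, q
    return point_distance((u1, v1), (u2, v2)) + angle_gap(phi1, phi2)
```

Directions that differ by nearly a full turn, such as 0.1 and $2\pi - 0.1$, are thus treated as close, with gap 0.2.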

The element $e(u, v, \varphi)$ determines a unique geodesic ray, namely, that one
with initial element e. If A is the point at infinity of this ray, it can be shown
that, as a consequence of III, §2, and the continuous variation of a geodesic
with continuous variation of the initial conditions, A varies continuously on U
as e varies continuously. Also, if P and A vary continuously, the direction
of gr(P, A) at P varies continuously, and if Q in $\Psi$ approaches A on U, the
direction of gs(P, Q) at P approaches that of gr(P, A) at P.

Let g be a geodesic with points at infinity A and B and let C be a point of
U distinct from A and B. Let s be the arc length on g measured from a point
$Q_0$ on g and positive in the direction of B. If Q(s) is the point of g with
coördinate s, let $\theta(s)$ be the angle at Q(s) between the directed geodesic rays
gr(Q(s), C) and gr(Q(s), A). Then $\theta(s)$ is a continuous monotonic decreasing
function of s such that $\lim_{s \to -\infty} \theta(s) = \pi$, $\lim_{s \to +\infty} \theta(s) = 0$. For by choosing the point P
sufficiently close to C, the angle $\varphi(s)$ at Q(s) = Q between gs(Q, P) and gr(Q, A)
can be made arbitrarily close to $\theta(s)$, and the angle $\varphi(s + \Delta s)$ at $Q(s + \Delta s)$ = Q’
between gs(Q’, P) and gr(Q’, A) can be made arbitrarily close to $\theta(s + \Delta s)$.
But it has been shown that $\varphi(s)$ is a decreasing function of s and it follows that
$\theta(s)$ must be monotonic decreasing. That $\theta(s)$ is continuous follows from the
fact that as $\Delta s$ approaches zero, $Q(s + \Delta s)$ approaches Q(s), and the initial
element of $gr(Q(s + \Delta s), C)$ approaches the initial element of gr(Q(s), C). To
show that $\lim_{s \to +\infty} \theta(s) = 0$, let R be a fixed point of g(B, C). Then $\theta(s)$ is less than
the angle $\alpha(s) = \angle RQ(s)A$. As s becomes positively infinite, $\alpha(s)$ approaches
zero and consequently $\theta(s)$ must approach zero. A similar argument shows
that $\lim_{s \to -\infty} \theta(s) = \pi$.



4. Geodesic circles. If geodesic polar coördinates are set up with any point
Q of $\Psi$ as base, the curve r = constant will be called a geodesic circle with
center Q. For brevity the term circle will mean geodesic circle except in the case
of the unit circle U. A circle is a simple closed curve of class $C'$ and is
perpendicular at a point of it to the geodesic determined by the point and the
center of the circle.

THEOREM 4.1. A circle and a geodesic intersect in two points, or are tangent
and have one point in common, or have no points in common.

This is an immediate consequence of the way in which the distance from a
fixed point to a point of a geodesic varies.













A domain $\mathfrak{D}$ of $\Psi$ is convex if, P and Q being arbitrary points of $\mathfrak{D}$, the geodesic
segment gs(P, Q) lies entirely in $\mathfrak{D}$.


THEOREM 4.2. A circle bounds a convex domain. 


For if P and Q are interior points of the circle C, and A and B are points at 
infinity of g(P, Q) such that the order of points on g(P, Q) is APQB, then gr(P, A) 
and gr(Q, B) must both intersect C. It follows from Theorem 4.1 that the 
segment gs(P, Q) can have no point on C and thus all of its points must be 
interior to C. 


THEOREM 4.3. If P and Q are points of the circle C, the points of gs(P, Q) 
other than P and Q are all interior points of C, the points of g(P, Q) not on gs(P, Q) 
are all exterior points of C. 


For the points of gs(P, Q) other than P and Q are all nearer the center of C 
than either P or Q. The points of g(P, Q) not on gs(P, Q) are all farther away 
from the center of C than either P or Q. 


THEOREM 4.4. Two circles have no points in common, or are tangent and have
one common point, or intersect in two points. In the last case the two points of 
intersection lie on opposite sides of the geodesic determined by the centers. 


Two circles with the same center and different radii have no common point. 
If two circles are tangent at P, it is readily shown that, except for P, C’ lies 
either entirely exterior or entirely interior to C. That two circles which have 
only one common point must be tangent is clear from the fact that if they are 
not tangent, each must have on it points interior and exterior to the other. 
Thus if C and C’ are the two circles and P and Q are points of C, respectively
interior and exterior to C’, the two arcs of C determined by P and Q must
intersect C’ in distinct points.

It remains to show that two circles can intersect in at most two points. Suppose
that C and C’ intersect in three points, $P_1$, $P_2$, and $P_3$. The centers
Q and Q’ of C and C’, respectively, cannot coincide. No one of the points
$P_1$, $P_2$, $P_3$ can lie on g(Q, Q’); for, if it did, the triangle property would exclude
the existence of any other common point. Therefore two of them, say $P_1$
and $P_2$, lie on the same side of g(Q, Q’) but not on it. Since $P_1$ and $P_2$ are
assumed distinct, $g(Q, P_1)$ and $g(Q, P_2)$ can have only Q in common. Similarly,
$g(Q', P_1)$ and $g(Q', P_2)$ can have only Q’ in common. It can be assumed that
the notation is so chosen that $P_1$ is exterior to the geodesic triangle $QQ'P_2$.
Then $gs(Q, P_2)$ cannot cut across $gs(Q', P_1)$; for, if there were such a point of
intersection R, it would follow that

$$P_2R > |Q'P_2 - Q'R| = |Q'P_1 - Q'R| = P_1R,$$

as well as

$$P_1R > |QP_1 - QR| = |QP_2 - QR| = P_2R,$$

and both of these cannot hold. Therefore the geodesic triangle $QP_1Q'$, which
constitutes a simple Jordan curve, would contain $P_2$ in its interior and $g(P_1, P_2)$,
which cannot intersect $g(Q, P_1)$ or $g(Q', P_1)$ except at $P_1$, would necessarily cut
gs(Q, Q’) in some point S. Since $QQ' < QP_1 + Q'P_1$, the sum of the radii of
C and C’, S would necessarily lie within one of the circles and Theorem 4.3
would be contradicted.

The proof of the theorem is complete.


THEOREM 4.5. If two circles intersect in one or two points, the geodesic segment
determined by their centers can have no point exterior to both circles.

If the circles C and C’ have centers Q and Q’ and radii r and r’, respectively,
the condition that they have a common point implies that $r + r' \ge QQ'$. But
if S is any point of gs(Q, Q’), $QS + Q'S = QQ' \le r + r'$, and either $QS \le r$ or
$Q'S \le r'$. This implies the stated result.

5. Equidistant curves. In the case of constant negative curvature the locus
of points equidistant from two points P and Q of $\Psi$ is the hyperbolic line which
bisects perpendicularly gs(P, Q). In the case of variable negative curvature
this locus is not necessarily a geodesic, but it can be shown that it has some
of the properties of a geodesic as far as behavior in the large is concerned.


THEOREM 5.1. The locus of points equidistant from two given points of $\Psi$ is a
continuous unending curve which is the topological image of a line and is of the
type of a hyperbolic line.

Let Q and Q’ be the two given points and let 2d = QQ’. If P is an arbitrary
point of $\Psi$, let r denote its distance from Q, r’ its distance from Q’. Then every
point of $\Psi$ has a coördinate pair (r, r’). If r < d, the geodesic circles with
centers Q and Q’ and radius r have no point in common, and thus no point of $\Psi$
has coördinates (r, r) if r < d. The midpoint of gs(Q, Q’) has the coördinates
(d, d), and it is evidently the only point of $\Psi$ with these coördinates. If r > d,
the circle $C_r$ with center Q and radius r intersects g(Q, Q’) in two points of
which one is at distance |r − 2d| and the other at distance r + 2d from Q’.
Thus there are points of $C_r$ interior and exterior to the circle $C'_r$ with center Q’
and radius r, and $C_r$ intersects $C'_r$. It follows from Theorem 4.4 that $C_r$
intersects $C'_r$ in just two points and these are on opposite sides of g(Q, Q’). Let $\sigma$
denote the closed region bounded by g(Q, Q’) and one of the arcs of U determined
by the points at infinity of g(Q, Q’). Then corresponding to each $r \ge d$, there is
just one point of $\sigma$ with the coördinates (r, r) and of these only (d, d) is on the
boundary of $\sigma$. It is sufficient to show that the set R(r, r), $r \ge d$, of $\sigma$ is the
topological image of a ray and is of the type of a hyperbolic ray.

It has been shown already that the set R is in one-to-one correspondence
with the set of values of r, $r \ge d$. To show that this correspondence is
continuous, let P(r, r) and $P'(r + \Delta r, r + \Delta r)$ be points of $\sigma$. If we did not have
$\lim_{\Delta r \to 0} P' = P$, because of the compactness of the space there would exist a point P*,
different from P, in $\sigma$ and with coördinates (r, r). But this is impossible.
Conversely, nearby points in $\Psi$ have coördinate pairs differing only slightly
and the correspondence between the set R and the set $r \ge d$ is topological.

It remains to show that the set R is of the type of a hyperbolic ray. To that
end, let T with coördinates (t, t), t > d, be a point of $\sigma$, let g(Q, Q’) have points
at infinity A and A’ such that the order on g(Q, Q’) is AQQ’A’, and let $B_t$ and
$B'_t$ be the points at infinity of gr(Q, T) and gr(Q’, T), respectively. The geodesic
g(Q, Q’) and the rays $gr(Q, B_t)$ and $gr(Q', B'_t)$ divide the part of $\Psi$ in $\sigma$ into four
parts which can conveniently be denoted by the vertices on their boundaries.
If X is any point, other than T, of $A'Q'TB_t$, gs(Q, X) must cut gs(Q’, T) in
some point Y and

$$Q'Y = Q'T - YT = QT - YT < QY.$$

Thus

$$Q'X < Q'Y + YX < QY + YX = QX,$$

and no point, other than T, of $A'Q'TB_t$ can be a point of R. A similar
statement holds for $AQTB'_t$.

If X is any point, other than T, of QQ’T, gr(Q, X) must cut gs(Q’, T) in some
point Z and

$$QX + Q'X < QX + XZ + Q'Z = QZ + Q'Z < QT + ZT + Q'Z = QT + Q'T = 2QT,$$

and at least one of the lengths QX and Q’X must be less than QT = t. It
follows that the points (r, r) of R which lie in QQ’T must have r < t. If we
combine these results, the points (r, r) of R for which r > t must lie in the interior of
$TB'_tB_t$. As t increases, the points $B_t$ and $B'_t$ move toward each other on U
and each must approach a limiting point. If $\lim_{t \to \infty} B_t = B$, then $\lim_{t \to \infty} T = B$,
and since T lies on $gr(Q', B'_t)$, it follows that $\lim_{t \to \infty} B'_t = B$.



The set R, except for (d, d), lies in the interior of the region bounded by
gs(Q, Q’) and the geodesic rays gr(Q, B) and gr(Q’, B). For all points of R lie
in the region bounded by gs(Q, Q’), $gr(Q, B_t)$, $gr(Q', B'_t)$, and the arc $B'_tB_t$ of U,
no matter what the value of t. Since as t becomes infinite $gr(Q, B_t)$ approaches
gr(Q, B) and $gr(Q', B'_t)$ approaches gr(Q’, B), the set R must lie either in or on
the boundary of the stated region. The point (d, d) is the only point of R on
gs(Q, Q’). Suppose there were a point T(t, t) of R on either gr(Q, B) or gr(Q’, B).
This would imply that either $B_t$ or $B'_t$ must coincide with B. But as t increases,
both of the points $B_t$ and $B'_t$ move, and neither can coincide with B. The set R
satisfies the stated condition.

It follows that the set R is of the type of gr(Q, B), or equally well gr(Q’, B),
and the proof of the theorem is complete.

The locus of points equidistant from the points P and Q of $\Psi$ will be denoted
by E(P, Q) and the points at infinity of the hyperbolic line which is of the type
of E(P, Q) will be called the points at infinity of E(P, Q).

The following theorem will be useful. 

THEOREM 5.2. Let Q and Q’ be points of $\Psi$ and D a point of U not identical
with either of the points at infinity of E(Q, Q’). Then there exists a neighborhood
of D and a positive constant $\delta$ such that if P is any point of $\Psi$ in this neighborhood,

$$|PQ - PQ'| > \delta.$$


Let A and A’ be the points at infinity of g(Q, Q’) such that the order of points
is AQQ’A’. If D coincides with A’, a neighborhood of D can be chosen so that
for every P of $\Psi$ in this neighborhood, gs(Q, P) has on it a point R within
distance $\tfrac{1}{4}d$ of Q’. But then

$$PQ - PQ' = PR + RQ - PQ' > \tfrac{7}{4}d - \tfrac{1}{4}d = \tfrac{3}{2}d.$$

A similar proof applies if D coincides with A.

If we assume that D is neither A nor A’, D lies in the interior of one of the four
intervals of U determined by A, A’, and the points at infinity B, B’, of E(Q, Q’).
The proofs are similar in the four cases and it can be assumed that D lies in
A’B. Then gr(Q, D) cuts across gr(Q’, B) in some point M. If the closed
subinterval $\iota$ of A’B with midpoint D is chosen sufficiently small, all the geodesic
rays with initial point Q and point at infinity in $\iota$ will intersect gr(Q’, B) in
points of a closed interval $\gamma$. No point of R is in $\gamma$, and by use of the method
of proof of Theorem 5.1 it is readily shown that if T is a point of $\gamma$, QT > Q’T.
Since $\gamma$ is closed and QT − Q’T varies continuously as T varies continuously,
there exists a $\delta > 0$ such that for T in $\gamma$, $QT - Q'T > \delta$. Let the neighborhood
of D be chosen so that if P is a point of $\Psi$ in it, P lies in the region bounded by $\iota$
and the geodesic with points at infinity at the ends of $\iota$. For any such P,
gs(Q, P) cuts across $\gamma$ in a point T and

$$QP - Q'P = QT + TP - Q'P > QT - Q'T > \delta.$$

The stated theorem is proved.


6. Horocycles. Existence and some geometric properties.² In the case of
constant negative curvature, a horocycle is a Euclidean circle tangent to U
at some point A and is an orthogonal trajectory of the field of hyperbolic lines
with A as point at infinity. The first of these characterizations evidently
cannot be extended to the case of variable negative curvature and the second
is not a convenient point of departure, though the geodesics with A as one
point at infinity do form a field in $\Psi$. However, in the case of constant negative
curvature, a horocycle is the limit of hyperbolic circles passing through a fixed
point of $\Psi$ as the centers approach a point A of U. The analogue of this will
be used to define the generalized horocycles which will be found to possess many
of the properties of the (hyperbolic) horocycles.

² The author is indebted to the referee for the proofs of the lemmas and theorems of this
section. The original proofs did not make use of convex sets and were considerably
longer.

Let $N_r(P)$ be the r-neighborhood of P, that is, the totality of points at
distance less than r from P. The boundary of $N_r(P)$ is a circle. For any P of $\Psi$
and A of U, let P(s) denote the point of gr(P, A) at distance s ($s \ge 0$) from P.
Let the horocyclic region C(P, A) be defined (cf. Busemann [3], pp. 144-145)
as the point set sum $\sum_{s > 0} N_s(P(s))$.


THEOREM 6.1. The set $\mathcal{C}(P, A)$ is an open convex set.


Since $\mathcal{C}(P, A)$ is the sum of open sets, it is open. If $Q_1$ and $Q_2$ are any two
points of $\mathcal{C}(P, A)$, it follows from the definition of $\mathcal{C}(P, A)$ that there exist
positive numbers $s_1$ and $s_2$ such that $Q_i$ is in $N_{s_i}(P(s_i))$ (i = 1, 2). With the
aid of the triangle inequality we can easily show that if $s_1 \ge s_2$, then
$N_{s_1}(P(s_1)) \supset N_{s_2}(P(s_2))$, and hence if s is the larger of $s_1$ and $s_2$, both points
$Q_1$ and $Q_2$ are in $N_s(P(s))$. But $N_s(P(s))$, s > 0, lies in $\mathcal{C}(P, A)$ and is convex. This
completes the proof of the theorem.
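The inclusion obtained from the triangle inequality can be written out as follows. This is only a sketch of the omitted step. It assumes that the arc of gr(P, A) between $P(s_2)$ and $P(s_1)$ realizes the distance $s_1 - s_2$ between these points, and X is a symbol introduced here for an arbitrary point of $N_{s_2}(P(s_2))$.

```latex
\[
  X P(s_1) \;\le\; X P(s_2) + P(s_2)P(s_1) \;<\; s_2 + (s_1 - s_2) \;=\; s_1
  \qquad (s_1 \ge s_2),
\]
\[
  \text{so that } X \subset N_{s_1}(P(s_1)), \quad\text{that is,}\quad
  N_{s_2}(P(s_2)) \subset N_{s_1}(P(s_1)).
\]
```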


LEMMA 6.1. If gr(P, A) and gr(P, B) are orthogonal at P, and $\theta(s)$ is the
acute angle between gr(P(s), B) and gs(P(s), P) (s > 0), then for every point Q
on gr(P, B)

\[ QP(s_2) - QP(s_1) \ge \int_{s_1}^{s_2} \cos\theta(s)\,ds > 0, \]

whenever $0 \le s_1 < s_2$.


For any point Q on gr(P, B) let $\varphi(s)$ be the angle between gs(P(s), Q) and
gs(P(s), P). As Q tends to B along gr(P, B), $\varphi(s)$ increases and approaches
$\theta(s) < \tfrac{1}{2}\pi$. It follows that (cf. Bliss [2], p. 100)

\[ \frac{d}{ds}\,QP(s) = \cos\varphi(s) > \cos\theta(s), \qquad s > 0, \]

whence by integration the lemma holds for $0 < s_1 < s_2$. By continuity it
holds if $s_1 = 0$.
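The integration step can be spelled out as below. This is a sketch under the notation of the proof, with $\varphi(s)$ the angle at P(s) for the fixed point Q and $\theta(s) < \tfrac{1}{2}\pi$ its limit; the final inequality uses only that $\cos\theta(s) > 0$.

```latex
\[
  QP(s_2) - QP(s_1)
  = \int_{s_1}^{s_2} \frac{d}{ds}\,QP(s)\,ds
  = \int_{s_1}^{s_2} \cos\varphi(s)\,ds
  > \int_{s_1}^{s_2} \cos\theta(s)\,ds
  > 0 .
\]
```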


THEOREM 6.2. In geodesic coördinates (x, y) with gr(P, A) as positive x-axis,
the horocyclic region $\mathcal{C}(P, A)$ is defined by inequalities $y^-(x) < y < y^+(x)$, x > 0,
and the boundary of $\mathcal{C}(P, A)$ consists of the two arcs $y = y^+(x)$ and $y = y^-(x)$,
where $y^+(x)$ and $y^-(x)$ are continuous functions defined for $x \ge 0$ and
$y^+(0) = y^-(0) = 0$.

All points $(\bar{x}, y)$, $\bar{x} < 0$, are exterior points of $N_x(P(x)) = N_x(x, 0)$, x > 0,
so that any such point is not a point of the set $\mathcal{C}(P, A)$. All points P(x),
x > 0, are in $\mathcal{C}(P, A)$, but P is not, so P is on the boundary of $\mathcal{C}(P, A)$ which
will be denoted by C(P, A). If the point (0, y), $y \ne 0$, is at distance $d_{xy}$ from
(x, 0), it follows from Lemma 6.1 that there exists a positive constant $\delta(y)$
such that $d_{xy} - x \ge \delta(y) > 0$. Hence the $\delta(y)$-neighborhood of (0, y) has no













points in $N_x(P(x))$ and the distance from (0, y) to the set $\mathcal{C}(P, A)$ is at least
$\delta(y)$. Thus, for every point of $\mathcal{C}(P, A)$, x > 0, and the only point of C(P, A)
with x = 0 is P itself.

Choose any $x_0 > 0$ and let B and C be the points at infinity of the geodesic
$x = x_0$. The point $(x_0, 0)$ is in $\mathcal{C}(P, A)$ and since $\mathcal{C}(P, A)$ is convex, it follows
that the part of $x = x_0$ in $\mathcal{C}(P, A)$ is an open segment or an open ray or an
entire geodesic. Let $X_0$ be the point $(x_0, 0)$ and let Y(y), y > 0, denote the
point $(x_0, y)$, where we assume the notation has been so chosen that Y(y)
approaches B as y becomes positively infinite. If $\theta(y)$ denotes the acute angle
between $gs(Y(y), X_0)$ and gr(Y(y), A), $\lim_{y\to+\infty} \theta(y) = 0$ and there exists a $y_0 > 0$
such that

\[ \int_0^{y_0} \cos\theta(y)\,dy > PX_0 + 1. \]

Now if Q is any point of $gr(X_0, A)$, it follows from Lemma 6.1 that

\[ QY(y_0) \ge QX_0 + \int_0^{y_0} \cos\theta(y)\,dy \ge QP + 1. \]

Thus the distance from $Y(y_0)$ to the set $\mathcal{C}(P, A)$ is at least 1 and not all of
$gr(X_0, B)$ is in $\mathcal{C}(P, A)$. Similarly, not all of $gr(X_0, C)$ is in $\mathcal{C}(P, A)$ and the
part of g(B, C) in $\mathcal{C}(P, A)$ is a finite open segment which may be denoted by
$y^-(x_0) < y < y^+(x_0)$.

It is easily shown that the convexity of the set $\mathcal{C}(P, A)$ and the finiteness
of $y^-(x)$ and $y^+(x)$ for each $x \ge 0$ imply the continuity of these functions.
This completes the proof of the theorem.
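For orientation, the functions $y^+(x)$ and $y^-(x)$ can be written down explicitly in the model case of constant curvature −1. The sketch below is an illustration under stated assumptions, not the general construction: it uses the upper half-plane with P = i and A the point at infinity of the imaginary axis, takes x as arc length along i e^x and y as signed arc length along the perpendicular geodesic |z| = e^x, and relies on the fact that the horocycle through i is then the line Im z = 1. The names `fermi_point`, `y_plus`, `y_minus` and `in_region` are hypothetical.

```python
import math


def fermi_point(x, y):
    """Half-plane point with geodesic coordinates (x, y) about the imaginary axis.

    Assumed convention: x is arc length along i*e^x, y is signed arc length
    along the perpendicular geodesic |z| = e^x.
    """
    return math.exp(x) * complex(math.tanh(y), 1.0 / math.cosh(y))


def y_plus(x):
    """Upper boundary arc: Im z = 1 gives cosh(y) = e^x, so y = arccosh(e^x)."""
    return math.acosh(math.exp(x))


def y_minus(x):
    """Lower boundary arc, the mirror image of y_plus."""
    return -y_plus(x)


def in_region(x, y):
    """Membership test y_minus(x) < y < y_plus(x) with x > 0."""
    return x > 0 and y_minus(x) < y < y_plus(x)
```

In this model both arcs start at the origin, are continuous for x ≥ 0, and every boundary point `fermi_point(x, y_plus(x))` has imaginary part 1.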

The set C(P, A) of points $(x, y^+(x))$ and $(x, y^-(x))$ will be called the horocycle
determined by P and A. As $\bar{x}$ becomes positively infinite, the points at infinity
of the geodesic $x = \bar{x}$ both move toward A. They must both approach A as
limit point for, if either one did not, $x = \bar{x}$ could not remain perpendicular to
gr(P, A). It follows that the whole of the geodesic $x = \bar{x}$ shrinks to A as $\bar{x}$
becomes infinite and consequently $\lim_{x\to+\infty} (x, y^+(x)) = A$ and
$\lim_{x\to+\infty} (x, y^-(x)) = A$.

The point A will be called the point at infinity of C(P, A). The horocycle
C(P, A) together with A forms a simple closed curve of which $\mathcal{C}(P, A)$ is the
interior.

The set $N_r(P)$, r > 0, has been defined as the set of points at distance less
than r from P. With the understanding that $N_r(P)$ is empty if r < 0, we can
state the following lemma.


LEMMA 6.2. If $\rho(s)$ is a non-negative function defined for s > 0 and
if $\lim_{s\to+\infty} \rho(s) = 0$, then

\[ \sum_{s>0} N_{s-\rho(s)}(P(s)) = \mathcal{C}(P, A). \]


Since $N_{s-\rho(s)}(P(s)) \subset N_s(P(s)) \subset \mathcal{C}(P, A)$, it follows that
$\sum_{s>0} N_{s-\rho(s)}(P(s)) \subset \mathcal{C}(P, A)$. Conversely, let Q be any point of
$\mathcal{C}(P, A)$. From the definition of







$\mathcal{C}(P, A)$ we can infer the existence of $s_0 > 0$ such that $Q \subset N_{s_0}(P(s_0))$, and
since this is an open set, there exists a $\delta > 0$ such that $N_\delta(Q) \subset N_{s_0}(P(s_0))$.
If s is so chosen that $s > s_0$ and $\rho(s) < \delta$, we have
$N_s(P(s)) \supset N_{s_0}(P(s_0)) \supset N_\delta(Q)$ and hence Q lies in
$N_{s-\delta}(P(s)) \subset N_{s-\rho(s)}(P(s)) \subset \sum_{s>0} N_{s-\rho(s)}(P(s))$. Thus
$\mathcal{C}(P, A) \subset \sum_{s>0} N_{s-\rho(s)}(P(s))$, and the proof is complete.


Lemma 6.3. If Q is on C(P, A), then C(Q, A) = C(P, A). 

Let P(s) be the point on gr(P, A) with PP(s) = s, and let $Q(\sigma)$ be the point
on gr(Q, A) with $QQ(\sigma) = \sigma$. It will first be shown that $\rho(s) = P(s)Q(s) \to 0$
as $s \to \infty$.

Since the two geodesic rays are asymptotic, for every $\delta > 0$ there is an $s(\delta)$
such that if $s \ge s(\delta)$ there is a point Q'(s) on gr(Q, A) with $P(s)Q'(s) < \delta$.
Also, since Q is on C(P, A), $s(\delta)$ can be taken so large that for all $s \ge s(\delta)$,
$0 \le QP(s) - s < \delta$. But $|QQ'(s) - QP(s)| \le P(s)Q'(s) < \delta$, so
$|s - QQ'(s)| < 2\delta$. By definition of Q(s), this gives $Q(s)Q'(s) < 2\delta$; hence
$\rho(s) = P(s)Q(s) < 3\delta$, and $\rho(s) \to 0$ as $s \to \infty$.

By the definition of $\rho(s)$, $N_{s-\rho(s)}(P(s)) \subset N_s(Q(s)) \subset \mathcal{C}(Q, A)$. Using Lemma
6.2, we see that $\mathcal{C}(P, A) = \sum_{s>0} N_{s-\rho(s)}(P(s)) \subset \mathcal{C}(Q, A)$. If we interchange the
roles of P and Q, $\mathcal{C}(Q, A) \subset \mathcal{C}(P, A)$. Hence $\mathcal{C}(P, A) = \mathcal{C}(Q, A)$, and the
lemma is established.
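The chain of estimates giving $\rho(s) < 3\delta$ may be displayed as follows. This is a sketch for $s \ge s(\delta)$, using only the triangle inequality and the fact that Q(s) and Q'(s) both lie on gr(Q, A), so that their distance is the difference of their distances from Q.

```latex
\begin{align*}
  Q(s)Q'(s) &= |\,QQ(s) - QQ'(s)\,| = |\,s - QQ'(s)\,| \\
            &\le |\,s - QP(s)\,| + |\,QP(s) - QQ'(s)\,| < \delta + \delta = 2\delta, \\
  \rho(s) = P(s)Q(s) &\le P(s)Q'(s) + Q'(s)Q(s) < \delta + 2\delta = 3\delta .
\end{align*}
```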


THEOREM 6.3. The horocycle C(P, A) has a continuously turning tangent at
every finite point and is an orthogonal trajectory of the geodesics with A as point
at infinity.


From Lemma 6.3 we see that it is sufficient to prove the theorem for the 
point P. 

On account of the convexity of $\mathcal{C}(P, A)$, as $x \to 0$, $g(P, (x, y^+(x)))$ must rotate
in the counterclockwise sense and must therefore approach a limiting geodesic
through P. Similarly, the geodesics determined by P and the points of $y^-(x)$
approach a limiting geodesic through P. Since C(P, A) is the limit of circles
through P perpendicular to g(P, A), these limits are the same and coincide
with the geodesic through P perpendicular to g(P, A). Thus C(P, A) has a
tangent direction at each point and is an orthogonal trajectory of the field of
geodesics with A as point at infinity.

That the curve has a continuously turning tangent is now an immediate 
consequence of the continuous variation of the geodesics determined by A and 
points which approach P along the continuous curve C(P, A). 


7. Approximation properties of the horocycles. A horocycle C(P, A) is
determined by a point P of $\Psi$ and a point A of U. In this section it will be
shown that a point of C(P, A) is approximated arbitrarily closely by points of
horocycles determined by P' near P and A' near A. The following theorem is
easily proved (cf. Busemann [3], p. 145).










THEOREM 7.1. A point Q is on C(P, A) if and only if it is the limit point of
points on circles with centers on gr(P, A) and passing through P, as the radii of
the circles become infinite.
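The convergence asserted in Theorem 7.1 can be observed numerically in the constant-curvature model. The following sketch is an illustration only: it assumes the upper half-plane with P = i and A the point at infinity of the imaginary axis, so that the circle of radius r through P with center on gr(P, A) has center i e^r and the horocycle C(P, A) is the line Im z = 1. The names `hyp_dist`, `gap_to_circle`, `Q` and `gaps` are hypothetical.

```python
import math


def hyp_dist(z, w):
    """Hyperbolic distance between z and w in the upper half-plane (curvature -1)."""
    return math.acosh(1.0 + abs(z - w) ** 2 / (2.0 * z.imag * w.imag))


def gap_to_circle(q, r):
    """Distance from q to the circle of radius r centered at i*e^r.

    That circle passes through P = i, and the distance from a point to a
    circle is the absolute difference between its distance to the center
    and the radius.
    """
    return abs(hyp_dist(q, 1j * math.exp(r)) - r)


Q = 2.0 + 1.0j  # a point of the horocycle Im z = 1
gaps = [gap_to_circle(Q, r) for r in (2.0, 5.0, 10.0)]
```

The entries of `gaps` decrease toward zero as the radius grows, which is the sense in which Q is a limit point of points on the circles.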

THEOREM 7.2. Given any point Q of the horocycle C(P, A) and $\epsilon > 0$, there
exists a $\delta > 0$ such that every circle with center within distance $\delta$ of A and having
on it a point within distance $\delta$ of P has on it a point within distance $\epsilon$ of Q. Conversely,
if Q is a point of $\Psi$ and a limit point of points on a sequence of circles $C_n$
with centers approaching A, and if $C_n$ has on it a point $P_n$ such that
$\lim_{n\to\infty} P_n = P$, Q is on C(P, A).

It will first be shown that given $\epsilon > 0$, there exists a $\delta > 0$ such that if R
is any point of $\Psi$ at distance less than $\delta$ from A, $|RP - RQ| < \tfrac{1}{2}\epsilon$. For it is
clear from Theorem 5.2 that A is a point at infinity of E(P, Q), therefore
E(P, Q) has on it points arbitrarily near A and between gr(P, A) and gr(Q, A),
thus arbitrarily close to both of these rays. Let $M \subset E(P, Q)$ and be such
that gr(P, A) and gr(Q, A) both cross an $\tfrac{1}{8}\epsilon$-neighborhood of M. Then $\delta > 0$
can be chosen so small that if $RA < \delta$, gs(P, R) and gs(Q, R) will have points
S and T, respectively, in the $\tfrac{1}{8}\epsilon$-neighborhood of M. But then, since MP = MQ,

\[ |RP - RQ| = |RS + SP - RT - TQ|
   \le |RS - RT| + |SP - MP| + |MQ - TQ| < \tfrac{1}{2}\epsilon. \]

Now let $\delta > 0$ also be chosen less than $\tfrac{1}{2}\epsilon$. Let C be a circle with center R
within distance $\delta$ of A and having on it a point N within distance $\delta$ of P. Then
we have

\[ |RN - RQ| \le |RN - RP| + |RP - RQ| < \tfrac{1}{2}\epsilon + \tfrac{1}{2}\epsilon = \epsilon. \]


The first statement of the theorem is proved. 

To prove the remainder of the theorem, suppose that the stated conditions
are satisfied and Q is not on C(P, A). Then A cannot be a point at infinity of
E(P, Q); for if it were, there would be points arbitrarily close to A and between
gr(P, A) and gr(Q, A), thus arbitrarily close to gr(P, A), and at the same
distance from P and Q. But then Q would be a limit point of points on circles
with centers on gr(P, A) and with radii arbitrarily large, and according to
Theorem 7.1 would be on C(P, A).

If A is not a point at infinity of E(P, Q), there exist according to Theorem
5.2 a $\delta > 0$ and a neighborhood of A such that if R is any point in $\Psi$ and in this
neighborhood, $|RP - RQ| > \delta$. But then $|RP_n - RQ| > \delta - |RP_n - RP|$
and there exists an N such that

\[ |RP_n - RQ| > \tfrac{1}{2}\delta > 0, \qquad n > N. \]


The point Q cannot be a limit point of the stated kind, and the proof of the 


theorem is complete. 










THEOREM 7.3. Given any point Q of the horocycle C(P, A) and $\epsilon > 0$, there
exists a $\delta > 0$ such that every horocycle with point at infinity within distance $\delta$
of A and having on it a point within distance $\delta$ of P has on it a point within distance
$\epsilon$ of Q. Conversely, if Q is a limit point of points on a sequence of horocycles
$C(P_n, A_n)$ such that $\lim_{n\to\infty} P_n = P$ and $\lim_{n\to\infty} A_n = A$, Q is on C(P, A).


Let Q be a point of C(P, A). Given $\epsilon > 0$, let $\delta > 0$ be chosen in accordance
with Theorem 7.2. Let C(M, B) be a horocycle such that $MP < \delta$ and
$AB < \delta$. Then if R is on gr(M, B), for all MR exceeding some constant,
$RA < \delta$. According to Theorem 7.2 every circle with center at such a point R
and passing through M has on it a point $S_R$ within distance $\epsilon$ of Q. As R
approaches B, the set $S_R$ must have a point of accumulation S within distance $\epsilon$
of Q, and it follows from Theorem 7.1 that S is a point of C(M, B). The first
part of the theorem is proved.

If Q is a limit point of points on a sequence of horocycles $C(P_n, A_n)$ such
that $\lim_{n\to\infty} P_n = P$ and $\lim_{n\to\infty} A_n = A$, by making use of the first part of
Theorem 7.1, we infer that Q is a limit point of points on a sequence of circles $C_n$
with centers approaching A and such that $P_n$ is on $C_n$. The stated result then
follows from Theorem 7.2.


8. Further geometric properties of the horocycles. 


THEOREM 8.1. Given two points P and Q of $\Psi$, there are just two horocycles
containing both P and Q. The points at infinity of these horocycles are the points
at infinity of E(P, Q).


Let A and B be the points at infinity of E(P, Q) and let $M_r$ be the point of
gr(P, A) at distance r from P and $C_r$ the circle with center $M_r$ and radius r.
Then P is on $C_r$. Since E(P, Q) lies in the region bounded by gr(P, B),
gr(Q, B), gr(P, A), and gr(Q, A), given $\epsilon > 0$, there exists an $\bar{r}$ such that if
$r > \bar{r}$, there is a point $N_r$ on E(P, Q) and within distance $\epsilon$ of $M_r$. Then

\[ |M_rQ - M_rP| \le |M_rQ - N_rQ| + |N_rQ - N_rP| + |N_rP - M_rP| < 2\epsilon. \]

The point Q is a limit point of points of circles $C_r$ with arbitrarily large radius.
From Theorem 7.1, Q must lie on C(P, A).

A similar proof applies to C(P, B) and the existence of two horocycles with 
the stated properties is proved. 

To complete the proof it is sufficient to show that any horocycle which contains
P and has neither A nor B as its point at infinity cannot contain Q. Let
such a horocycle be C(P, D). According to Theorem 5.2, there exist a neighborhood
of D and a $\delta > 0$ such that if R is any point of $\Psi$ in this neighborhood,
$|RP - RQ| > \delta$. But then Q cannot be a limit point of points on circles
through P, with centers on gr(P, D) and arbitrarily large radii, and consequently,
from Theorem 7.1, Q cannot be on C(P, D).
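In the constant-curvature model the two horocycles of Theorem 8.1 can be computed directly. The sketch below is an illustration under stated assumptions: it uses the upper half-plane, where a horocycle with finite point at infinity a is a Euclidean circle tangent to the real axis at a, and it assumes that P and Q have different imaginary parts so that neither horocycle is a horizontal line. The function name `horocycles_through` is hypothetical. The equation solved for a is the same one that describes the points at infinity of the locus E(P, Q) of points equidistant from P and Q in this model.

```python
import math


def horocycles_through(p, q):
    """Return [(a, rho), ...] for the two horocycles through p and q.

    a is the point of tangency on the real axis (the point at infinity of
    the horocycle) and rho is the Euclidean radius. Assumes p.imag != q.imag.
    """
    x1, y1 = p.real, p.imag
    x2, y2 = q.real, q.imag
    # Tangency at a means ((x - a)^2 + y^2) / (2 y) = rho at both points:
    #   y2 * ((a - x1)^2 + y1^2) = y1 * ((a - x2)^2 + y2^2),
    # which is a quadratic equation in a.
    qa = y2 - y1
    qb = -2.0 * (y2 * x1 - y1 * x2)
    qc = y2 * (x1 ** 2 + y1 ** 2) - y1 * (x2 ** 2 + y2 ** 2)
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    roots = sorted([(-qb - disc) / (2.0 * qa), (-qb + disc) / (2.0 * qa)])
    return [(a, ((x1 - a) ** 2 + y1 ** 2) / (2.0 * y1)) for a in roots]


P, Q = 1j, 1.0 + 2.0j
pairs = horocycles_through(P, Q)
```

For P = i and Q = 1 + 2i the two points at infinity come out as −3 and 1, and each of the two circles passes through both P and Q.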










THEOREM 8.2. There is one and only one point of a horocycle with A as point 
at infinity on each geodesic with A as point at infinity. 

Let P be any point of the given horocycle and let g(C, A) be any geodesic 
with A as point at infinity. Since g(C, A) and g(P, A) are asymptotic, there 
are on g(C, A) points which are interior to C(P, A). But A is the only point 
of U on C(P, A), therefore C(P, A) must cut g(C, A) in a finite point. 

If there were two such points, say Q and Q’, then C(Q, A) = C(P, A) = 
C(Q’, A), and this is impossible. The stated result is proved. 

The following theorem is implied by more general considerations of Busemann
([3], pp. 145-146).

THEOREM 8.3. Two horocycles with the same point at infinity A cut off equal 
intercepts on the geodesics with point at infinity A. 

THEOREM 8.4. The finite points of intersection of a geodesic and a horocycle 
fulfill one, and only one, of the following conditions: 

(a) there are none; 

(b) there is one, in which case the geodesic and horocycle are either tangent at
the common point, or orthogonal at the common point and have the same point at
infinity;

(c) there are two, in which case the geodesic and horocycle are neither tangent 
nor orthogonal. 

Let the points at infinity of the geodesic be B and C and let the horocycle
be C(P, A). If A coincides with either B or C, it follows from Theorem 8.2
that g(B, C) and C(P, A) have just one point in common and from Theorem 6.3
that they are orthogonal at this point. This is one of the possibilities under (b).

Assuming that A does not coincide with either B or C, we see that there exists
a point D of U such that g(A, D) and g(B, C) are orthogonal. The horocycle
C(P, A) and the geodesic g(A, D) have just one point Q in common, and if we
make Q the origin and gr(Q, A) the positive x-axis in a geodesic normal coördinate
system, C(P, A) is given by functions of class C', $y = y^+(x)$ and $y = y^-(x)$,
$x \ge 0$, $y^+(0) = y^-(0) = 0$. But the coördinate system has been so chosen that
g(B, C) has the equation $x = \bar{x}$. It follows that if $\bar{x} < 0$, g(B, C) and C(P, A)
have no point in common. If $\bar{x} = 0$, g(B, C) and C(P, A) have just one point Q
in common, and since g(B, C) is orthogonal to gr(Q, A) and the same is true
of C(P, A), g(B, C) and C(P, A) must be tangent. Finally, if $\bar{x} > 0$,
$y^-(\bar{x}) < 0 < y^+(\bar{x})$ and g(B, C) and C(P, A) have just two points in common. At
neither of these points can g(B, C) and C(P, A) be normal, for this would imply
the coincidence of A with either B or C. At neither of these points can they
be tangent, for then one of the functions $y^-(x)$ or $y^+(x)$ would not have a derivative
at $x = \bar{x}$.
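The three cases of the second part of the proof can be tabulated in the constant-curvature model. The sketch below is an illustration only and rests on an assumption not made in the text: with curvature −1 and the coördinates of the proof, the horocycle through the origin is given by $y^{\pm}(x) = \pm\,\mathrm{arccosh}(e^x)$. The function name `intersections` is hypothetical.

```python
import math


def intersections(x_bar):
    """y-coordinates where the geodesic x = x_bar meets the horocycle through the origin.

    Assumes curvature -1, where the horocycle is y = +-arccosh(e^x), x >= 0.
    """
    if x_bar < 0:
        return []          # case (a): no common point
    if x_bar == 0:
        return [0.0]       # case (b): tangency at the origin
    y = math.acosh(math.exp(x_bar))
    return [-y, y]         # case (c): two common points, symmetric about the x-axis
```

The geodesics sharing the point at infinity A with the horocycle are not of the form x = constant in these coördinates, so the orthogonal alternative of case (b) is not covered by this sketch.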


9. Element approximation. Since a horocycle is a curve of class C’, if 
oriented it bears an element at each of its points. Various theorems concerning 
element approximation are needed for the derivation of the permanent regional 
transitivity, and these will be treated now. 













A circle is a right circle if it is so oriented that if P is a point of it and R is
its center, the relation of the initial direction of the directed segment gs(P, R)
to the direction of the circle at P is the same as that of the positive x-axis to
the positive y-axis. The circle with the opposite orientation is a left circle.

A horocycle is a right horocycle if it is so oriented that if P is a point of it
and A is its point at infinity, the relation of the initial direction of the directed
geodesic ray gr(P, A) to the direction of the horocycle at P is the same as that
of the positive x-axis to the positive y-axis. The notation will be $C_R(P, A)$.
A left horocycle has the opposite orientation and the notation $C_L(P, A)$.

It is to be understood in the following, and indeed it was implicit before, that
whenever the notations gs(P, Q), gr(P, A), g(P, Q), g(P, A), g(A, B), where P
and Q are points of $\Psi$ and A and B are points of U, are used in connection with
oriented geodesic segments, geodesic rays, or geodesics, the orientation is such
that the order of the two determining points is that given.


THEOREM 9.1. Any element of a right (left) horocycle C(P, A) is the limit 
element of elements on right (left) circles through points approaching P and with 
centers approaching A. Conversely, if an element is a limit element of elements 
on such right (left) circles, it is an element of the right (left) horocycle C(P, A). 


Let q be the element of $C_R(P, A)$ at the point Q. The direction of q is then
perpendicular to the direction of the initial element e of gr(Q, A) and the relation
of e to q is the same as that of the positive x-axis to the positive y-axis.
Let $C_n$ (n = 1, 2, ···) be a sequence of circles with centers $R_n$ (n = 1, 2, ···)
such that $\lim_{n\to\infty} R_n = A$ and let $P_n \subset C_n$ (n = 1, 2, ···) such that
$\lim_{n\to\infty} P_n = P$. It follows from Theorem 7.2 that there exists a sequence of
points $Q_n$, $Q_n \subset C_n$, such that $\lim_{n\to\infty} Q_n = Q$. The initial element $e_n$ of
$gs(Q_n, R_n)$ approaches e as n becomes infinite. If $C_n$ is oriented so that it is a
right circle, the element $q_n$ of it at $Q_n$ is perpendicular to $e_n$ and $e_n$ has the same
relation to $q_n$ that the positive x-axis has to the positive y-axis. It follows that
$\lim_{n\to\infty} q_n = q$, and the first part of the theorem is proved in the case of right
horocycles and right circles. The proof in the case of left is similar.

Let q at the point Q be an element satisfying the condition with respect to
right circles stated in the second part of the theorem. It immediately follows
from Theorem 7.2 that $Q \subset C(P, A)$. If the right circles are $C_n$, with centers
$R_n$, and $Q_n \subset C_n$ such that $\lim_{n\to\infty} Q_n = Q$, it is easily seen that the element $q_n$
of $C_n$ at $Q_n$ must approach q. Similar results apply to the case of left circles,
and the proof of the theorem is complete.

The point P divides the horocycle C(P, A) into two parts, one part consisting
of P and those points of C(P, A) on one side of g(P, A), the other part consisting
of P and the points of C(P, A) on the other side of g(P, A). Each of
these parts will be termed a semihorocycle. In the notation of §6, one of these
semihorocycles is given by $y = y^+(x)$, the other by $y = y^-(x)$. The point A
will be called the point at infinity of either semihorocycle.
















If C(P, A) is oriented, the two semihorocycles have an orientation induced
in them, and one of these will have its initial point at P. If C(P, A) is so
oriented that it becomes a right (left) horocycle, that semihorocycle which has
its initial point at P will be called a right (left) semihorocycle and will be denoted
by $SC_R(P, A)$ {$SC_L(P, A)$}. Again if the notation of §6 is used, $SC_R(P, A)$ is
given by $y = y^+(x)$, with initial point P, and $SC_L(P, A)$ is given by $y = y^-(x)$,
with initial point P.

THEOREM 9.2. The right (left) semihorocycles with initial point P form a
field in $\Psi$ except at P.

Let Q be any point of $\Psi$ other than P. Then according to Theorem 8.1
there are just two horocycles containing both P and Q and the points at infinity
A and B of these horocycles are the points at infinity of the set E(P, Q). The
point P divides each of these horocycles into two semihorocycles of which just
one contains Q. Thus there are just two oriented semihorocycles with initial
point P which contain Q. To complete the proof it is sufficient to show that
one of these is a right and the other a left semihorocycle.

Neither A nor B can coincide with the points at infinity of g(P, Q). We
can assume that the notation has been so chosen that if the initial element of
gr(P, A) is rotated in the clockwise direction until it coincides with the initial
element of gr(P, Q), the angle of rotation is less than $\pi$. It was shown in §5
that A and B lie on opposite sides of g(P, Q). Therefore if the initial element
of gr(P, B) is rotated in the counterclockwise direction until it coincides with
the initial element of gr(P, Q), the angle of rotation is less than $\pi$. But then
in the normal geodesic coördinate system with P as origin and gr(P, A) as
positive x-axis, Q has a negative y-coördinate, while in the normal geodesic
coördinate system with P as origin and gr(P, B) as positive x-axis, Q has a
positive y-coördinate. Thus one of the oriented semihorocycles which have initial
point P and which contain Q is a right and the other a left semihorocycle.
The proof is complete.

Let R be the point in which a horocycle with point at infinity A intersects
the geodesic g(O, A) determined by the origin O and the point A. Let distance
on g(O, A) be measured from O and negative in the direction of A. Then R
has a coördinate r. Conversely, given A and r, the horocycle is completely
determined, for it is the horocycle C(R, A). The number r will be called the
radius of the horocycle and the horocycle determined by r and A will be denoted
by C(r, A).

We may state without proof that the circle with center O and radius |r|
contains no points of the horocycle C(r, A) in its interior.
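The statement about the circle of radius |r| can be checked numerically in the constant-curvature case. The sketch below is an illustration under stated assumptions: it uses the unit-disk model of curvature −1 with O = 0 and A = 1, in which a point at hyperbolic distance d from O has Euclidean modulus tanh(d/2) and a horocycle with point at infinity A is a Euclidean circle internally tangent to U at A. With the sign convention of the text, the point R with coördinate r then sits at −tanh(r/2) on the real diameter. The names `horocycle_points`, `dist_from_origin` and `min_distance` are hypothetical.

```python
import cmath
import math


def horocycle_points(r, n=360):
    """Sample the horocycle C(r, A), A = 1, in the unit-disk model (curvature -1)."""
    t = -math.tanh(r / 2.0)  # Euclidean coordinate of R; negative means away from A
    center, rho = (1.0 + t) / 2.0, (1.0 - t) / 2.0
    # k = 0 would give the point at infinity A = 1 itself, so it is skipped.
    return [center + rho * cmath.exp(2j * math.pi * k / n) for k in range(1, n)]


def dist_from_origin(z):
    """Hyperbolic distance from O = 0 to z in the unit disk."""
    return 2.0 * math.atanh(abs(z))


def min_distance(r):
    """Smallest sampled distance from O to the horocycle C(r, A)."""
    return min(dist_from_origin(z) for z in horocycle_points(r))
```

For both signs of r the smallest sampled distance equals |r| and is attained at R, so no sampled point of the horocycle lies inside the circle of radius |r| about O.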


THEOREM 9.3. Let $C_R(r_n, A_n)$ (n = 1, 2, ···) be a sequence of right horocycles
such that $\lim_{n\to\infty} r_n = r$ and $\lim_{n\to\infty} A_n = A$. If p is an arbitrary element of
$C_R(r, A)$, there exist elements $p_n$ of $C_R(r_n, A_n)$ (n = 1, 2, ···) such that
$\lim_{n\to\infty} p_n = p$. The corresponding result obtained when right is replaced by left holds.










Let R be the point where $C_R(r, A)$ cuts g(O, A) and let $R_n$ be the point where
$C_R(r_n, A_n)$ cuts $g(O, A_n)$. Since, with increasing n, $r_n$ approaches r and any
finite segment of g(O, A) is approximated uniformly closely by $g(O, A_n)$, necessarily
$\lim_{n\to\infty} R_n = R$. But then, from Theorem 7.3, if P is the point bearing p,
there exist points $P_n$ (n = 1, 2, ···), $P_n \subset C_R(r_n, A_n)$, such that
$\lim_{n\to\infty} P_n = P$. It follows that if $e_n$ denotes the initial element of $gr(P_n, A_n)$,
$\lim_{n\to\infty} e_n = e$, the initial element of gr(P, A). The element $p_n$ of $C_R(r_n, A_n)$ at
$P_n$ is obtained from $e_n$ by rotating $e_n$ through the angle $\tfrac{1}{2}\pi$. Since p is obtained
from e by a rotation of $\tfrac{1}{2}\pi$, it follows that $\lim_{n\to\infty} p_n = p$, and this is the stated
result. The similar proof for left horocycles is omitted.


THEOREM 9.4. If the sequence of elements $p_n$ (n = 1, 2, ···) approaches the
element p as n becomes infinite, and if the right (left) horocycles determined by $p_n$
and p have points at infinity $A_n$ and A and radii $r_n$ and r, respectively,
$\lim_{n\to\infty} A_n = A$ and $\lim_{n\to\infty} r_n = r$.


The proof will be carried through only for the case of right horocycles. The 
proof for left horocycles is entirely similar. 

Let $P_n$ be the point bearing $p_n$ and let P be the point bearing p. Then the
initial element $e_n$ of the directed geodesic ray $gr(P_n, A_n)$ is perpendicular to $p_n$
and the relation of $e_n$ to $p_n$ is the same as that of the positive x-axis to the
positive y-axis. The same relationship holds between e, the initial element of
g(P, A), and p. Since $\lim_{n\to\infty} p_n = p$, it follows that $\lim_{n\to\infty} e_n = e$ and
consequently $\lim_{n\to\infty} A_n = A$.

The set $|r_n|$ is a bounded set. For the circle with center at the origin O
and radius $|r_n|$ does not contain $P_n$ in its interior and consequently $|r_n| \le OP_n$.
Since $OP_n \le OP + PP_n$, and the set $PP_n$ is bounded, it follows that the set
$|r_n|$ (n = 1, 2, ···) is a bounded set.

Let $Q_n$ be the point in which $C(P_n, A_n)$ intersects $g(O, A_n)$ and let Q be the
point where C(P, A) intersects g(O, A). Then $\lim_{n\to\infty} Q_n = Q$. For since the
set $|r_n| = OQ_n$ is bounded and $\lim_{n\to\infty} A_n = A$, the only limit points of the set
$Q_n$ must be on g(O, A). Suppose that there exists a subsequence $Q_{n_i}$
(i = 1, 2, ···) such that $\lim_{i\to\infty} Q_{n_i} = Q^* \ne Q$. Then $Q^*$ is on g(O, A), and
since Q is the only point of C(P, A) on g(O, A), $Q^*$ is not on C(P, A). But
then, as in the proof of Theorem 7.2, A is not a point at infinity of $E(P, Q^*)$
and there exists a neighborhood $\eta$ of A such that for any point S of $\Psi$ in $\eta$,
$|SP - SQ^*|$ exceeds a positive constant, independent of S. Since $\lim_{n\to\infty} P_n = P$,
$\lim_{i\to\infty} Q_{n_i} = Q^*$ and $\lim_{n\to\infty} A_n = A$, there exist points S in $\eta$ for which $|SP - SQ^*|$













is less than a given constant. Therefore $Q^*$ must coincide with Q, and so
$\lim_{n\to\infty} Q_n = Q$. It follows at once that $\lim_{n\to\infty} r_n = r$.


THEOREM 9.5. Let $C(r_n, A_n)$ (n = 1, 2, ···) be a sequence of horocycles such
that $\lim_{n\to\infty} A_n = A$ and $\lim_{n\to\infty} r_n = +\infty$ and let B and D be points of U distinct from A.
Then there exists an $\bar{n}$ such that for $n > \bar{n}$, $C(r_n, A_n)$ intersects g(B, D) in two
points. As n becomes infinite, one of the points approaches B, the other D, and
the angle of intersection at each of these points approaches $\tfrac{1}{2}\pi$.


Let $\Delta$ be an interval of U with A as midpoint and such that neither B nor D
is in $\Delta$. Let $\bar{n}$ be so chosen that for $n > \bar{n}$, $A_n$ is in $\Delta$ and $r_n$ is greater than
the minimum distance from O to g(B, D). Then for $n > \bar{n}$, the circle with
center O and radius $r_n$ intersects g(B, D) in two points $P_n$ and $Q_n$. Since all
the interior points of the segment $P_nQ_n$ are interior points of this circle, they
are interior points of $C(r_n, A_n)$. Since B and D are exterior points of $C(r_n, A_n)$,
assuming that the order of points on g(B, D) is $B P_n Q_n D$, we see that each of the
geodesic rays $P_nB$ and $Q_nD$ intersects $C(r_n, A_n)$. Let these points of intersection
be $S_n$ and $T_n$, respectively. Since $\lim_{n\to\infty} r_n = +\infty$, the segment $P_nQ_n$ becomes
infinite as n becomes infinite and eventually includes any finite segment of
g(B, D). Thus $\lim_{n\to\infty} P_n = B$, $\lim_{n\to\infty} Q_n = D$, and consequently
$\lim_{n\to\infty} S_n = B$ and $\lim_{n\to\infty} T_n = D$.

As n becomes infinite, the angle at $S_n$ between $gr(S_n, A)$ and $gr(S_n, D)$
approaches zero. Given $\epsilon > 0$, there exists an integer m such that this angle
at $S_m$ is less than $\tfrac{1}{2}\epsilon$. Let E be so chosen on U (but not on the arc ADB)
and so near A that the angle at $S_m$ between $gr(S_m, A)$ and $gr(S_m, E)$ is less
than $\tfrac{1}{2}\epsilon$. For n sufficiently large, the point $S_n$ will lie on $gr(S_m, B)$ and $A_n$
will lie on the arc DAE. But then $gr(S_n, A_n)$ will cross $gr(S_m, E)$ in some
point $V_n$. In the geodesic triangle $V_n S_m S_n$, the angle $\varphi$ at $S_n$ is less than the
exterior angle $E S_m D$ and this is less than $\epsilon$. The angle $\varphi$ is the angle between
the geodesic rays $gr(S_n, A_n)$ and $gr(S_n, D)$ and thus is less than $\epsilon$. Since
$C(r_n, A_n)$ is normal to $gr(S_n, A_n)$ at $S_n$, the stated result with respect to the
angle of intersection at $S_n$ follows. A similar proof holds for the angle of intersection
at $T_n$, and the proof is complete.
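The behavior described in Theorem 9.5 is easy to see in a symmetric special case of constant curvature. The sketch below is an illustration only: it assumes the unit-disk model of curvature −1 with O = 0, a fixed point at infinity A = 1 (so $A_n = A$ for all n), and the geodesic g(B, D) taken as the vertical diameter with B = −i and D = i. Since the model is conformal, the Euclidean angle between the horocycle and the diameter is the angle of intersection. The function name `intersection_data` is hypothetical, and r > 0 is required.

```python
import math


def intersection_data(r):
    """Intersection of C(r, A), A = 1, with the diameter from -i to i (unit disk).

    Returns (height, angle): the two common points are (0, +height) and
    (0, -height), and angle is the angle between horocycle and diameter there.
    Requires r > 0, i.e. the horocycle encloses O.
    """
    t = -math.tanh(r / 2.0)                       # Euclidean coordinate of R
    center, rho = (1.0 + t) / 2.0, (1.0 - t) / 2.0
    height = math.sqrt(rho * rho - center * center)
    # The radius to (0, height) is (-center, height); the tangent is (height, center),
    # which makes an angle arccos(center / rho) with the vertical diameter.
    angle = math.acos(center / rho)
    return height, angle


h1, a1 = intersection_data(1.0)
h2, a2 = intersection_data(10.0)
```

As r increases the two points of intersection move toward B and D and the angle increases toward $\tfrac{1}{2}\pi$; in this special case the cosine of the angle works out to $e^{-r}$.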


10. Permanent regional transitivity. Let F be a Fuchsian group with principal
circle U and let $\lambda^2(u, v)$ of (2.1) be invariant under F. Then the metric
(2.1) will be invariant under F, the corresponding geodesics will be invariant
and, since Euclidean angle is invariant under such transformations and angle
defined by (2.1) is Euclidean, angle will be invariant.

If points congruent under F are considered identical, there is defined a two- 
dimensional manifold M of negative curvature which may or may not be closed, 
depending on the nature of the group F. Just as in the case of constant nega- 
tive curvature (cf. Hedlund [6], p. 539) the directed geodesics define a flow 
















in the space E of elements. It was shown in the paper just referred to that in 
the case of constant negative curvature and Fuchsian group of the first kind 
the flow has the property of permanent regional transitivity; that is, given any 
two open sets in the space of elements, either one of these eventually flows 
into the other. Furthermore, this intersection is permanent in the sense that 
it holds permanently after a certain time. A further result needed to extend 
this result to the case of variable negative curvature is the following theorem, 
which is obvious in the case of constant curvature. 

If a geodesic g is periodic on M, there must be a transformation T of F taking
g into itself. The points at infinity of g must then be fixed points of T and T
is a hyperbolic transformation which leaves g invariant.


THEOREM 10.1. On a periodic geodesic g(B, D) let there be given a sequence of
congruent points $P_n$ (n = 1, 2, ···) such that $\lim_{n\to\infty} P_n = D$, and let P be any point
of $\Psi$. Then the angle at $P_n$ between g(B, D) and the right (left) semihorocycles
with initial point P and passing through $P_n$ approaches $\tfrac{1}{2}\pi$ as n becomes infinite.

Let $A_n$ be the point at infinity of $SC_R(P, P_n)$. Then $gr(P_n, A_n)$ is perpendicular
to $SC_R(P, P_n)$ at $P_n$. To prove the theorem it is sufficient to show
that if $\alpha_n$ is the angle between the initial elements of $gr(P_n, A_n)$ and $gr(P_n, B)$,
$\lim_{n\to\infty} \alpha_n = 0$. As n becomes infinite the angle between the initial elements of
$gs(P_n, P)$ and $gr(P_n, B)$ approaches zero, so that $\lim_{n\to\infty} \alpha_n = 0$ if
$\lim_{n\to\infty} \beta_n = 0$, where $\beta_n$ is the angle between the initial elements of
$gs(P_n, P)$ and $gr(P_n, A_n)$.

Since g(B, D) is periodic and the points $P_n$ are congruent, there exists a
transformation T of F such that under $T^{N(n)}$ the points $P_n$ can be carried back
to a fixed point S of g(B, D). Since $\lim_{n\to\infty} P_n = D$, either $\lim_{n\to\infty} N(n) = +\infty$ or
$\lim_{n\to\infty} N(n) = -\infty$, and the notation is assumed such that the first is the case.
Then $\lim_{n\to\infty} T^{N(n)}(P) = B$. We show that if $A'_n = T^{N(n)}(A_n)$, $\lim_{n\to\infty} A'_n = B$.
If this were not the case, there would exist a sequence $n_i$ (i = 1, 2, ···) of
integers such that $\lim_{i\to\infty} A'_{n_i} = C$, where C is not identical with B. Let GH be a
closed interval of U with C as center and so small that B is an exterior point
of it. Then $SC_L(S, G)$, $SC_L(S, H)$, and GH form a simple closed curve bounding
a region of which B is an exterior point. There exists a neighborhood of B
such that any left semihorocycle with initial point S and point at infinity
in GH has no point in this neighborhood. Since there exists an I such that
for i > I, $A'_{n_i}$ is in GH, all the left horocycles $T^{N(n_i)} SC_L(P_{n_i}, A_{n_i})$, i > I, are
such that there is no point of any one of them in the stated neighborhood of B.
But $T^{N(n_i)}(P)$ is on $T^{N(n_i)} SC_L(P_{n_i}, A_{n_i})$ and the fact that $\lim_{n\to\infty} T^{N(n)}(P) = B$ is
contradicted. We conclude that $\lim_{n\to\infty} A'_n = B$.










Since T and its powers preserve angle, the angle $\beta_n$ is equal to the angle
between the initial elements of $gs(S, T^{N(n)}(P))$ and $gr(S, A'_n)$. Since
$\lim_{n\to\infty} A'_n = B$ and $\lim_{n\to\infty} T^{N(n)}(P) = B$, we conclude that the last angle
approaches zero and thus $\lim_{n\to\infty} \beta_n = 0$. This completes the proof of the theorem.


The theorems which allow the application of the methods of Hedlund [6]
in the case of constant negative curvature to the case of variable negative
curvature are now at hand. The following theorems of Hedlund hold for a
manifold of variable negative curvature and a Fuchsian group of the first kind.

THEOREM 1.1'. If the group F is a Fuchsian group of the first kind with principal
circle U, P is an arbitrary point of $\Psi$, and AB is an arbitrary interval of U,
then there exist points C and D of AB such that $SC_R(P, C)$ and $SC_L(P, D)$ are
both transitive.

THEOREM 1.2'. If F is a Fuchsian group of the first kind, there exists an infinite
set of transitive directed horocycles through any point of $\Psi$. The points at infinity
of these transitive horocycles form an everywhere dense set on U.

THEOREM 2.1'. If one directed horocycle with A as point at infinity is transitive,
all the directed horocycles with A as point at infinity are transitive.

THEOREM 2.2'. If F is a Fuchsian group of the first kind, the end points of all
axes of hyperbolic transformations of F are h-transitive.

THEOREM 2.3'. If F is a Fuchsian group of the first kind and there are copies
of the horocycle C(r, A) with radii arbitrarily large, A is h-transitive.

THEOREM 2.4'. Let F be a Fuchsian group of the first kind, A a point of U
and gr(O, A) the geodesic ray with origin O as initial point and with A as point
at infinity. If there exists on gr(O, A) a sequence of points $O_0, O_1, O_2, \cdots$
such that $\lim_{n\to\infty} OO_n = +\infty$ and such that $O_n$ has a copy $O'_n$ (n = 0, 1, 2, ···)
with $OO'_n$ bounded, n arbitrary, A is h-transitive.

THEOREM 2.5'. If F has a fundamental region R which together with its
boundary lies entirely interior to U, all points of U are h-transitive.

THEOREM 2.6'. If F is of the first kind and if the only boundary points of R
on U are parabolic points, all points of U, with the exception of those which are
fixed points of parabolic transformations of F, are h-transitive.


THEOREM 3.1'. If F is of the first kind, the flow defined by the geodesics has the
property of permanent regional transitivity.


BIBLIOGRAPHY 


1. L. Bieberbach, Über Tchebychefsche Netze auf Flächen negativer Krümmung, sowie auf
einigen weiteren Flächenarten, Sitzungsberichte der Preussischen Akademie der
Wissenschaften, 1926, XXIII, pp. 294-300.

2. G. A. Bliss, The Calculus of Variations, Carus Monograph, Chicago, 1925.













SURFACES OF NEGATIVE CURVATURE AND TRANSITIVITY 229 


3. H. Busemann, Über die Geometrien, in denen die Kreise mit unendlichem Radius die
kürzesten Linien sind, Mathematische Annalen, vol. 106(1932), pp. 140-160.

4. J. Hadamard, Les surfaces à courbures opposées et leurs lignes géodésiques, Journal de
Mathématiques pures et appliquées, (5), vol. 4(1898), pp. 27-73.

5. G. A. Hedlund, Two-dimensional manifolds and transitivity, Annals of Mathematics,
vol. 37(1936), pp. 534-542.

6. G. A. Hedlund, Fuchsian groups and transitive horocycles, this Journal, vol. 2(1936),
pp. 530-542.

7. G. A. Hedlund, The dynamics of geodesic flows, Bulletin of the American Mathematical
Society, vol. 45(1939), pp. 241-260.

8. H. Hopf and W. Rinow, Ueber den Begriff der vollständigen differentialgeometrischen
Fläche, Commentarii Mathematici Helvetici, vol. 3(1931), pp. 209-225.

9. M. Morse, A fundamental class of geodesics on any closed surface of genus greater than
one, Transactions of the American Mathematical Society, vol. 26(1924), pp.
25-61.

10. M. Morse, Instability and transitivity, Journal de Mathématiques pures et appliquées,
(9), vol. 14(1935), pp. 49-71.


BRYN MAWR COLLEGE.











THE MEASURE OF GEODESIC TYPES ON SURFACES OF NEGATIVE 
CURVATURE 


By Gustav A. HEDLUND 


1. Introduction. Various problems arise in connection with the transitivity 
of a dynamical system. The first of these concerns the existence of transitive 
motions. Secondly, if there exist such motions, what is the measure of the 
totality of transitive motions? Thirdly, is the system metrically transitive? 
If the last holds, almost all the motions are transitive, so that each of the first 
two of the above properties is a consequence of the following one. 

The first of these problems has been solved in the case of the geodesic problem 
on a class of surfaces of negative curvature and even for surfaces with some 
positive or zero curvature, provided there is not enough to destroy the instability 
of the geodesics (cf. Morse [1], Hedlund [1]). The second and third problems 
have been solved only for a restricted subclass of these surfaces, namely, those 
of constant negative curvature, of finite area, and of finite connectivity (cf.
E. Hopf [1]). The constancy of the curvature plays an important rôle in the
proofs of these results, for it implies that certain transformations are analytic 
and thus transform sets of measure zero into sets of measure zero. It is the 
lack of information concerning the corresponding transformations in the variable 
case which causes the difficulty in applying the methods of the constant case. 

This paper gives a solution of the second problem for a class of surfaces of
negative curvature, not necessarily constant, of finite area and finite
connectivity. This class includes, in particular, all closed orientable surfaces of
negative curvature. It also includes surfaces with “parabolic” openings. It is
shown that on all surfaces of the stated class almost all the geodesics are
transitive. The extension of this result to non-orientable surfaces is easily proved.

In the definition of the preceding class of surfaces, a Fuchsian group of the 
first kind is used. If the defining Fuchsian group is of the second kind, the 
geodesics behave in an entirely different manner. It is shown that in this case 
almost all the geodesics are unstable in the sense that for both future and past 
time they eventually remain outside any fixed finite region of the surface. 
(Cf. E. Hopf [1] for results of this kind in the case of constant negative curva- 
ture.) This class includes surfaces with “hyperbolic” openings similar to the 
surfaces which Hadamard (cf. Hadamard [1]) constructed. The class does not 
include all these Hadamard surfaces, however, so that it is not possible to say 
that in all cases the perfect sets of geodesics discovered by Hadamard are sets 
of measure zero. This problem will be taken up in a later paper. 

The method used is similar to that used by Tuller (cf. Tuller [1]) in attaining


Received August 24, 1938. 













GEODESIC TYPES ON SURFACES OF NEGATIVE CURVATURE 231 


similar results for three-dimensional manifolds of constant negative curvature. 
It was first necessary to extend various results concerning horocycles (cf. Hed- 
lund [2]) to the case of variable negative curvature, work which has been carried 
through by Grant. (Cf. the preceding paper of this Journal. This paper will 
hereafter be referred to as Grant.) The present paper derives the added prop- 
erties which are needed to solve the stated problems. 


2. A class of simply-connected two-dimensional manifolds of negative
curvature. Let U be the unit circle, u^2 + v^2 = 1, and let Ψ be its interior. The metric
in Ψ will be defined by

(2.1)    ds^2 = \frac{\lambda^2(u, v)\,(du^2 + dv^2)}{(1 - u^2 - v^2)^2},

where λ(u, v) is of class C^k, k ≥ 7, and

(H.1)    0 < a \le \lambda(u, v) \le b

in Ψ. The total or Gaussian curvature K(u, v) of (2.1) is then of at least class C^5
in Ψ, and it will be assumed that λ(u, v) is such that in Ψ

(H.2)    -d^2 \le K(u, v) \le -c^2 < 0,

where c and d are positive constants. The condition (H.2) is satisfied, in
particular, if λ(u, v) is constant, for then the curvature is constant and negative.

As a consequence of the hypothesis of negative curvature, the geodesics have
many properties in common with the hyperbolic lines (i.e., the geodesics when
λ = const.). There is a unique geodesic segment joining two points P_1 and P_2
of Ψ and this geodesic yields a minimum of arc length with respect to all
rectifiable curves lying in Ψ and joining P_1 and P_2. Thus no two geodesics
intersect more than once, all geodesic segments are of class A, all unending geodesics
are of class A and each is of the type of a hyperbolic line. There is just one
unending geodesic of the type of a given hyperbolic line, so the geodesics and
hyperbolic lines are in one-to-one correspondence. The geodesics through a
point P of Ψ form a field in Ψ except at P. Thus it is possible to set up geodesic
polar coördinates with any point P of Ψ as center.

It has been shown by Grant (cf. Grant, §6, et seq.) that, under the conditions 
imposed, there exist curves called horocycles which display the properties of 
the horocycles in the case of constant curvature. In particular, a horocycle is a 
limiting curve of geodesic circles with radii becoming infinite and is an orthog- 
onal trajectory of the geodesics which have a common point at infinity. 

The distance D(P_1, P_2) between two points P_1 and P_2 of Ψ will be defined
as the arc length (as determined by (2.1)) of the unique geodesic segment joining
P_1 and P_2.

An element e(u, v, θ), u^2 + v^2 < 1, 0 ≤ θ < 2π, determines a point P(u, v) of
Ψ and a direction θ at that point, where the angle θ is to be measured in the
counterclockwise sense from a direction parallel to the positive u-axis. Con-










232 GUSTAV A. HEDLUND 


versely, a point of Ψ and a direction at this point determines a unique element.
The distance D_e(e_1, e_2) between elements e_1(u_1, v_1, θ_1) and e_2(u_2, v_2, θ_2) will
be defined by

        D_e(e_1, e_2) = D(P_1, P_2) + \{\theta_2 - \theta_1\},

where P_1 is the point (u_1, v_1), P_2 is the point (u_2, v_2) and {θ_2 − θ_1} is the least
value of the set |θ_2 − θ_1 + 2nπ| (n = 0, ±1, ···). The space of elements
e(u, v, θ), u^2 + v^2 < 1, 0 ≤ θ < 2π, will be denoted by E. The volume in E
will be defined by the integral

(2.2)    \iiint \frac{\lambda^2(u, v)}{(1 - u^2 - v^2)^2}\, du\, dv\, d\theta,

and measurability and measure will be that determined by this definition of
volume.

If we interpret time as arc length along the geodesics, the geodesics define a
flow of the space E into itself. It is easily shown that the integral (2.2) and
hence measure are invariant under this flow.
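
The distance between elements defined in this section can be illustrated by a short Python sketch. It is not part of the paper and rests on an assumption: the geodesic distance D is available in closed form only in the constant case, so the sketch takes λ = 2 (curvature −1) and uses the familiar formula for the hyperbolic distance in the unit disc. The function names are hypothetical.

```python
import math


def angle_gap(theta1, theta2):
    """Least value of |theta2 - theta1 + 2 n pi| over integers n (in [0, pi])."""
    d = (theta2 - theta1 + math.pi) % (2.0 * math.pi) - math.pi
    return abs(d)


def disc_distance(p1, p2):
    """Geodesic distance D(P1, P2), ASSUMING lambda = 2 (constant curvature -1)."""
    (u1, v1), (u2, v2) = p1, p2
    num = 2.0 * ((u1 - u2) ** 2 + (v1 - v2) ** 2)
    den = (1.0 - u1 * u1 - v1 * v1) * (1.0 - u2 * u2 - v2 * v2)
    return math.acosh(1.0 + num / den)


def element_distance(e1, e2):
    """Distance between elements e = (u, v, theta): D(P1, P2) plus the angle gap."""
    (u1, v1, t1), (u2, v2, t2) = e1, e2
    return disc_distance((u1, v1), (u2, v2)) + angle_gap(t1, t2)
```

Two elements at the same point thus lie at a distance of at most π from each other, the maximum being attained by opposite directions.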


3. A differential geometric identity. Since the geodesics through a point P
of Ψ form a field in Ψ except at P, it is possible to set up geodesic polar
coördinates with P as center. The first fundamental form is then given by

(3.1)    ds^2 = dr^2 + G^2(r, \phi)\, d\phi^2,


where, because of the hypotheses on the original quadratic form, the non-negative
function G(r, φ) is of at least class C^4 in the region

(R)    0 \le r < +\infty, \qquad -\infty < \phi < +\infty,

and

        G(r, \phi + 2n\pi) = G(r, \phi) \qquad (n = 0, \pm 1, \pm 2, \cdots).


The following boundary conditions are satisfied

(3.2)    G(0, \phi) = 0, \qquad \left( \frac{\partial G}{\partial r} \right)_{r=0} = 1,

and G(r, φ) satisfies the differential equation

(3.3)    \frac{\partial^2 G}{\partial r^2} + K \cdot G = 0,
y 


where K = K(r, ¢) is the Gaussian curvature and is of at least class C* in R. 
With the aid of (3.2) and (3.3) we see that

        G_{rr}(0, \phi) = G_{\phi}(0, \phi) = G_{\phi\phi}(0, \phi) = G_{r\phi}(0, \phi) = G_{r\phi\phi}(0, \phi) = G_{rr\phi}(0, \phi) = 0.










Thus

(3.4)    G(r, \phi) = r + r^3 R(r, \phi), \qquad G_{\phi}(r, \phi) = r^3 R^*(r, \phi)

in R, where |R(r, φ)| and |R*(r, φ)| are bounded functions in any region
0 ≤ r ≤ δ, −∞ < φ < +∞, where δ > 0.

In the case of constant negative curvature, G(r, φ) is a function of r alone
and the expression for it in terms of exponential functions is readily determined
from (3.3). In this case the fundamental identity

(I)    G_{\phi}(r, \phi) = -G(r, \phi) \int_0^r \left\{ \frac{1}{G^2(s, \phi)} \int_0^s K_{\phi}(t, \phi)\, G^2(t, \phi)\, dt \right\} ds, \qquad r > 0,

where one of the integrals on the right is an improper integral, is trivial, for
both sides vanish identically. We show that the identity holds in the general
case under consideration.
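
The constant-curvature case mentioned above can be made concrete with a small Python sketch, offered only as an illustration. It assumes K = −c² with c > 0 a constant, for which the solution of the equation G_rr + K·G = 0 with G(0) = 0, G_r(0) = 1 is G(r) = sinh(cr)/c; the helper names and the step h are hypothetical choices.

```python
import math


def G_const(r, c):
    """G(r) = sinh(c r) / c, the solution for constant curvature K = -c^2."""
    return math.sinh(c * r) / c


def remainder_R(r, c):
    """The function R in the expansion G = r + r^3 R, for r > 0."""
    return (G_const(r, c) - r) / r ** 3


def jacobi_residual(r, c, h=1e-3):
    """Central-difference value of G_rr + K G with K = -c^2 (should be near 0)."""
    second = (G_const(r + h, c) - 2.0 * G_const(r, c) + G_const(r - h, c)) / (h * h)
    return second - c * c * G_const(r, c)
```

In this case R(r) remains bounded near r = 0 and tends to c²/6, while G does not depend on the angle, so both sides of the identity vanish.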

To that end, let

        h(s) = \frac{1}{G^2(s, \phi)} \int_0^s K_{\phi}(t, \phi)\, G^2(t, \phi)\, dt, \qquad s > 0.

Then h(s) is continuous, s > 0. It follows from (3.4) and the boundedness of
R(r, φ) in a neighborhood of r = 0 that there exists a δ > 0 such that

        G(r, \phi) > \frac{r}{2}, \qquad G(r, \phi) < \frac{3r}{2}, \qquad 0 < r \le \delta.

Let B be an upper bound of |K_φ(r, φ)| for 0 ≤ r ≤ δ. Then for 0 < s ≤ δ
we have

        |h(s)| \le \frac{4}{s^2}\, B \int_0^s \frac{9t^2}{4}\, dt = 3Bs.

It follows that lim_{s→0} h(s) = 0 and consequently the integral on the right in (I)
exists.
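If we differentiate (3.3) partially with respect to φ, there results

        G_{rr\phi} = -K_{\phi} G - K G_{\phi},

and thus

        \int_0^s K_{\phi}(t, \phi)\, G^2(t, \phi)\, dt = -\int_0^s \{ G(t, \phi)\, G_{rr\phi}(t, \phi) - G_{rr}(t, \phi)\, G_{\phi}(t, \phi) \}\, dt.

By integration by parts, we obtain

        \int_0^s K_{\phi}(t, \phi)\, G^2(t, \phi)\, dt = -G(s, \phi)\, G_{r\phi}(s, \phi) + G_r(s, \phi)\, G_{\phi}(s, \phi),

and thus

        h(s) = \frac{1}{G^2(s, \phi)} \{ -G(s, \phi)\, G_{r\phi}(s, \phi) + G_r(s, \phi)\, G_{\phi}(s, \phi) \} = -\frac{\partial}{\partial s} \left( \frac{G_{\phi}(s, \phi)}{G(s, \phi)} \right).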











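Thus the right side of (I) is equal to

        G_{\phi}(r, \phi) - G(r, \phi) \lim_{\epsilon \to 0} \frac{G_{\phi}(\epsilon, \phi)}{G(\epsilon, \phi)}.

Making use of (3.4), we have

        \frac{G_{\phi}(\epsilon, \phi)}{G(\epsilon, \phi)} = \frac{\epsilon^2 R^*(\epsilon, \phi)}{1 + \epsilon^2 R(\epsilon, \phi)},

and since |R*(ε, φ)| and |R(ε, φ)| are bounded in 0 ≤ ε ≤ δ, δ > 0, φ arbitrary,

        \lim_{\epsilon \to 0} \frac{G_{\phi}(\epsilon, \phi)}{G(\epsilon, \phi)} = 0.

The proof that the identity (I) holds is complete.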


4. The oscillation of G(r, φ) on arcs of geodesic circles. If G_M and G_m denote
the maximum and minimum, respectively, of G(r, φ) on an arc of length l of a
geodesic circle, the ratio G_M/G_m will in general vary with l, the arc, and the
geodesic circle. In the case of constant curvature, G depends only on r, so
that the ratio always has the value 1. It will be shown that if an additional
hypothesis, which is automatically fulfilled in a large number of cases, is made,
the ratio G_M/G_m is less than a constant which depends only on l and not on the
particular geodesic circle or the arc of it chosen.

The additional hypothesis concerns the curvature and is as follows. 


(H.3) There exists a constant A such that

        \frac{1}{G} \left| \frac{\partial K}{\partial \phi} \right| \le A, \qquad r > 0,

where the geodesic polar coördinate system under consideration is arbitrary.


If s is the arc length on the geodesic circle passing through the point (r, φ),
r > 0, it follows from (3.1) that dφ/ds = G^{-1}, and thus

        \frac{1}{G} \frac{\partial K}{\partial \phi} = \frac{dK}{ds},

where dK/ds is the directional derivative of K along the geodesic circle. Thus
if |dK/ds| is uniformly bounded by a constant which is independent of the
geodesic polar coördinate system used, the condition (H.3) is satisfied.

The first step in the derivation of the desired result is the proof that there
exists a constant B, independent of the polar coördinate system chosen, such
that

        \frac{1}{G^2} \left| \frac{\partial G}{\partial \phi} \right| \le B, \qquad r \ge 1.






It follows from the identity (I) of §3 that

        \frac{1}{G^2} \frac{\partial G}{\partial \phi} = -\frac{1}{G(r, \phi)} \int_0^r \frac{1}{G^2(s, \phi)} \left\{ \int_0^s K_{\phi}(t, \phi)\, G^2(t, \phi)\, dt \right\} ds,

and hence, with the aid of (H.3), we obtain

(4.1)    \frac{1}{G^2} \left| \frac{\partial G}{\partial \phi} \right| \le \frac{A}{G(r, \phi)} \int_0^r \frac{1}{G^2(s, \phi)} \left\{ \int_0^s G^3(t, \phi)\, dt \right\} ds.

In order to study the function on the right of inequality (4.1) we make
considerable use of the fact that G(r, φ) satisfies (3.3). It is somewhat simpler to
consider the ordinary differential equation obtained from (3.3) by holding φ
constant.

Let y(x) be that solution of the differential equation

(4.2)    y'' = f(x) \cdot y,

f(x) continuous and 0 < c^2 ≤ f(x) ≤ d^2 in 0 ≤ x < +∞, determined by the
initial conditions

(4.3)    y(0) = 0, \qquad y'(0) = 1.

Then y(x) is of class C^2, 0 ≤ x < +∞, and since f(x) is positive, it follows that
y(x) > 0 and y'(x) > 0, 0 < x < +∞. With the aid of well known comparison
methods, we have

(4.4)    c\, \frac{e^{cx} + e^{-cx}}{e^{cx} - e^{-cx}} \le \frac{y'(x)}{y(x)} \le d\, \frac{e^{dx} + e^{-dx}}{e^{dx} - e^{-dx}}, \qquad x > 0.

Since the function on the left is decreasing for x > 0 and approaches c as limit
when x becomes positively infinite, it follows that

(4.5)    \frac{y'(x)}{y(x)} > c, \qquad x > 0.

Since the function on the right is decreasing, it follows that

(4.6)    \frac{y'(x)}{y(x)} \le d\, \frac{e^{d} + e^{-d}}{e^{d} - e^{-d}}, \qquad x \ge 1.
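
The comparison inequalities for y'/y can be checked numerically. The following Python sketch is an illustration only: it integrates y'' = f(x)·y with y(0) = 0, y'(0) = 1 by the classical fourth-order Runge-Kutta method for one sample coefficient f oscillating between c² and d² (the particular f, the step size and the function names are assumptions of the sketch) and tests whether c·coth(cx) ≤ y'/y ≤ d·coth(dx) at every grid point.

```python
import math


def integrate_jacobi(f, x_max, h=1e-3):
    """Classical RK4 for y'' = f(x) * y with y(0) = 0, y'(0) = 1.

    Returns a list of (x, y, y') samples on the grid x = 0, h, 2h, ...
    """
    x, y, p = 0.0, 0.0, 1.0
    samples = [(x, y, p)]
    steps = int(round(x_max / h))
    for i in range(steps):
        k1y, k1p = p, f(x) * y
        k2y, k2p = p + 0.5 * h * k1p, f(x + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3p = p + 0.5 * h * k2p, f(x + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4p = p + h * k3p, f(x + h) * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        x = (i + 1) * h
        samples.append((x, y, p))
    return samples


def coth(t):
    return math.cosh(t) / math.sinh(t)


def comparison_holds(c=1.0, d=2.0, x_max=5.0, x_min=0.1):
    """Check c*coth(c x) <= y'/y <= d*coth(d x) on the grid for a sample f."""
    def f(x):
        # Sample coefficient with c^2 <= f(x) <= d^2 (a choice made for this sketch).
        return 0.5 * (c * c + d * d) + 0.5 * (d * d - c * c) * math.sin(3.0 * x)

    for x, y, p in integrate_jacobi(f, x_max):
        if x < x_min:
            continue  # skip the neighbourhood of x = 0, where y'/y is unbounded
        if not (c * coth(c * x) <= p / y <= d * coth(d * x)):
            return False
    return True
```

With f constant and equal to c² the integration reproduces y = sinh(cx)/c, for which the left-hand inequality becomes an equality.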
Now consider the function

        v(x) = \frac{1}{y^3(x)} \int_0^x y^3(\eta)\, d\eta, \qquad x > 0,

where y(x) is again the solution of (4.2) satisfying the initial conditions (4.3).
Then v(x) is of class C^2 and positive for x > 0.


LEMMA 4.1. There exist positive constants d_1 and d_2, depending only on c and
d, such that

(4.7)    d_1 < v(x) < d_2, \qquad x \ge 1.








By comparison we have

(4.8)    \frac{e^{cx} - e^{-cx}}{2c} \le y(x) \le \frac{e^{dx} - e^{-dx}}{2d}, \qquad x \ge 0,

and thus

        0 < c_1 = \frac{8d^3}{(e^{d} - e^{-d})^3} \int_0^1 \left( \frac{e^{cx} - e^{-cx}}{2c} \right)^3 dx \le v(1)
                \le \frac{8c^3}{(e^{c} - e^{-c})^3} \int_0^1 \left( \frac{e^{dx} - e^{-dx}}{2d} \right)^3 dx = c_2.

Let d_1 be the smaller of the two positive constants c_1/2 and

        \frac{e^{d} - e^{-d}}{6d\,(e^{d} + e^{-d})}.

Let d_2 be the greater of 2c_2 and 1/(3c). The constants d_1 and d_2 depend only on
c and d.
By differentiation it follows that v(x) is a solution of the differential equation

        v' + \frac{3y'}{y}\, v = 1, \qquad x > 0.

At x = 1 the inequalities of the lemma are satisfied. Suppose that x_0 > 1
and v(x_0) = d_1. We can suppose that x_0 is the least of such values and under
this assumption v'(x_0) ≤ 0, since v(1) ≥ c_1 > d_1. But with the aid of (4.6),

        v'(x_0) = 1 - v(x_0)\, \frac{3y'(x_0)}{y(x_0)} \ge 1 - d_1 \cdot 3d\, \frac{e^{d} + e^{-d}}{e^{d} - e^{-d}} > 0,

and from this contradiction we infer that v(x) ≠ d_1, x ≥ 1. Since v(x) is
continuous and v(1) ≥ c_1 > d_1, it follows that v(x) > d_1, x ≥ 1.

Suppose that x_0 > 1 and v(x_0) = d_2. Assuming that x_0 is the least of such
values, since v(1) ≤ c_2 < d_2, we have v'(x_0) ≥ 0. But from (4.5)

        v'(x_0) = 1 - v(x_0)\, \frac{3y'(x_0)}{y(x_0)} < 1 - d_2 \cdot 3c \le 0.

Thus v(x) ≠ d_2, x ≥ 1. Since v(x) is continuous and v(1) ≤ c_2 < d_2, it follows
that v(x) < d_2, x ≥ 1, and the proof of the lemma is complete.
Let the function z(x) be defined as follows:

        z(x) = \frac{1}{y^2(x)} \int_0^x y^3(\eta)\, d\eta, \quad x > 0; \qquad z(0) = 0,

where y(x) is again the solution of (4.2) satisfying the initial conditions (4.3).
The function z(x) is continuous for x ≥ 0. The continuity for x > 0 follows













from the continuity of y(x), x > 0, and the fact that y(x) > 0, x > 0. To show
the continuity at x = 0, we make use of (4.8) and obtain

        0 < z(x) \le \frac{4c^2}{(e^{cx} - e^{-cx})^2} \int_0^x \left( \frac{e^{d\eta} - e^{-d\eta}}{2d} \right)^3 d\eta, \qquad x > 0,

whence

        0 < z(x) \le \frac{4c^2\,(e^{dx} - e^{-dx})^3}{(e^{cx} - e^{-cx})^2\, 8d^3}\; x, \qquad x > 0.

Since the function on the right approaches zero as x approaches zero,
lim_{x→0} z(x) = 0, and z(x) is continuous at x = 0. It is evident that z(x) is of
class C^2 for x > 0.
Let

        w(x) = \frac{1}{y(x)} \int_0^x z(\xi)\, d\xi, \qquad x > 0,

where y(x) and z(x) are as defined above. Then w(x) is of class C^2 and positive
for x > 0.


LEMMA 4.2. There exist positive constants e_1 and e_2, depending only on c and d,
such that

        e_1 < w(x) < e_2, \qquad x \ge 1.

The method of proof is similar to that used in proving Lemma 4.1, and only
the proof of the second inequality will be given. It follows from (4.8) and the
definition of w(x) that

        0 < w(1) \le \frac{2c}{e^{c} - e^{-c}} \int_0^1 \frac{(2c)^2}{(e^{c\xi} - e^{-c\xi})^2} \left\{ \int_0^{\xi} \left( \frac{e^{d\eta} - e^{-d\eta}}{2d} \right)^3 d\eta \right\} d\xi = h_2.

Let e_2 be the larger of the two positive numbers 2h_2 and 2d_2/c, where d_2 is that
of Lemma 4.1. Then w(1) ≤ h_2 < e_2. Suppose that at x_0 > 1, w(x_0) = e_2.
It can be assumed that x_0 is the least of such values and thus w'(x_0) ≥ 0. But
it is readily shown that w(x) is a solution of the differential equation

        w' + w\, \frac{y'}{y} = \frac{1}{y^3} \int_0^x y^3(\eta)\, d\eta = v(x), \qquad x > 0,

and thus, with the aid of (4.5) and Lemma 4.1, we get

        w'(x_0) = -w(x_0)\, \frac{y'(x_0)}{y(x_0)} + v(x_0) \le -e_2 c + d_2 < 0.

From this contradiction we infer that we cannot have w(x) = e_2, x ≥ 1. But
since w(1) < e_2 and w(x) is continuous, it follows that w(x) < e_2, x ≥ 1. The
constant d_2 depends only on c and d, so that the same is true of e_2.
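
The boundedness asserted in Lemmas 4.1 and 4.2 can be observed directly in the constant case. The Python sketch below is an illustration under the assumption f(x) = c², so that y(x) = sinh(cx)/c; it tabulates v(x) = y⁻³∫₀ˣ y³ and w(x) = y⁻¹∫₀ˣ z, with z = y⁻²∫₀ˣ y³, by the trapezoidal rule. The function name and the step size are hypothetical choices.

```python
import math


def v_and_w(c=1.0, x_max=10.0, h=1e-3):
    """Tabulate (x, v(x), w(x)) for y(x) = sinh(c x) / c on the grid x = h, 2h, ...

    Both integrals are accumulated with the trapezoidal rule, using
    y(0) = 0 and z(0) = 0 as starting values.
    """
    steps = int(round(x_max / h))
    y_prev, z_prev = 0.0, 0.0
    int_y3, int_z = 0.0, 0.0
    table = []
    for i in range(1, steps + 1):
        x = i * h
        y = math.sinh(c * x) / c
        int_y3 += 0.5 * h * (y_prev ** 3 + y ** 3)   # integral of y^3 up to x
        z = int_y3 / y ** 2
        int_z += 0.5 * h * (z_prev + z)              # integral of z up to x
        table.append((x, int_y3 / y ** 3, int_z / y))
        y_prev, z_prev = y, z
    return table
```

For c = 1 both v(x) and w(x) stay between fixed positive bounds for x ≥ 1 and approach 1/3 as x increases, in agreement with the two lemmas.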













LEMMA 4.3. There exists a constant B, determined by A, c and d, such that

        \frac{1}{G^2} \left| \frac{\partial G}{\partial \phi} \right| \le B, \qquad r \ge 1.

For any fixed φ, G(r, φ) is a solution of the differential equation

(4.9)    \frac{d^2 y}{dr^2} = -K(r, \phi) \cdot y, \qquad r > 0,

and satisfies the initial conditions

(4.10)    G(0, \phi) = 0, \qquad G_r(0, \phi) = 1.

Since 0 < c^2 ≤ −K(r, φ) ≤ d^2, the equation (4.9) satisfies the conditions on
(4.2) and the initial conditions (4.10) are the same as those of (4.3). If we
replace y(x) by G(r, φ), w(x) becomes

        \frac{1}{G(r, \phi)} \int_0^r \frac{1}{G^2(s, \phi)} \left\{ \int_0^s G^3(t, \phi)\, dt \right\} ds,

and it follows from Lemma 4.2 that this function does not exceed e_2 when r ≥ 1.
If we combine this with (4.1), there results

        \frac{1}{G^2} \left| \frac{\partial G}{\partial \phi} \right| \le A e_2, \qquad r \ge 1.

If we set B = A e_2, the stated lemma is proved.
We come to the proof of the desired theorem.


THEOREM 4.1. Given l > 0, there exists a constant D, depending only on B
and l, such that if any point P in Ψ is used as center of geodesic polar coördinates,
with resulting quadratic form

        ds^2 = dr^2 + G^2(r, \phi)\, d\phi^2,

if σ is an arc of length l of any geodesic circle of radius greater than unity of this
system, and G_M(σ) and G_m(σ) are the maximum and minimum, respectively, of
G(r, φ) on σ, then

        \frac{G_M}{G_m} \le D.

Case I. Bl < 1.

Let s be the arc length on σ measured from one end point. Then on σ, G(r, φ)
is a function of class C′ of s, and if s_M and s_m denote the values of s where G
assumes the values G_M and G_m, respectively, we have

        G_M - G_m = \left( \frac{dG}{ds} \right)_P (s_M - s_m) = \left( \frac{1}{G} \frac{\partial G}{\partial \phi} \right)_P (s_M - s_m),










where P is a point of σ. With the aid of Lemma 4.3, we obtain

        G_M - G_m \le G_P \left( \frac{1}{G^2} \left| \frac{\partial G}{\partial \phi} \right| \right)_P |s_M - s_m| \le B\, l\, G_M.

It follows that

        \frac{G_M}{G_m} \le \frac{1}{1 - Bl},

and if we let D = (1 − Bl)^{-1}, the stated result is proved.

Case II. Bl ≥ 1.

Let l_1, l_2, ···, l_n be a set of positive numbers such that \sum_{i=1}^{n} l_i = l and B l_i < 1
(i = 1, 2, ···, n). Let the arc σ be divided into a set of n successive closed arcs
I_1, I_2, ···, I_n, of lengths l_1, l_2, ···, l_n, respectively. Let {}_iG_M be the
maximum of G on I_i and let {}_iG_m be the minimum on this same arc. There exists
an integer j, 1 ≤ j ≤ n, such that G_M = {}_jG_M, and an integer k, 1 ≤ k ≤ n,
such that G_m = {}_kG_m. We can assume that j ≤ k, for the case j > k can be
reduced to this case by taking the arcs in the opposite order on σ. Then

        \frac{G_M}{G_m} = \frac{{}_jG_M}{{}_kG_m} = \frac{{}_jG_M}{{}_jG_m} \cdot \frac{{}_jG_m}{{}_{j+1}G_M} \cdot \frac{{}_{j+1}G_M}{{}_{j+1}G_m} \cdots \frac{{}_{k-1}G_m}{{}_kG_M} \cdot \frac{{}_kG_M}{{}_kG_m},

where there is just one factor on the right if k = j. Since the intervals I_i
and I_{i+1} are closed and adjacent, they have a point in common and

        \frac{{}_iG_m}{{}_{i+1}G_M} \le 1.

Applying this and using the fact that Case I applies to each I_i (i = 1, 2, ···, n),
we see that

        \frac{G_M}{G_m} \le \frac{1}{1 - Bl_j} \cdot \frac{1}{1 - Bl_{j+1}} \cdots \frac{1}{1 - Bl_k}.

If we let

        D = \frac{1}{1 - Bl_1} \cdot \frac{1}{1 - Bl_2} \cdots \frac{1}{1 - Bl_n},

the stated result is proved.
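
The two cases of the proof translate into a simple recipe for an admissible constant D. The Python sketch below is an illustration and makes one assumption that the proof does not require: in Case II the arc is cut into n equal pieces, with n chosen so that B·l/n < 1/2. The function name is hypothetical.

```python
import math


def oscillation_bound(B, l):
    """Return a constant D with G_M / G_m <= D for arcs of length l.

    Case I  (B*l <  1): D = 1 / (1 - B*l).
    Case II (B*l >= 1): split l into n equal pieces l/n with B*l/n < 1/2
                        (equal pieces are an assumption of this sketch) and
                        multiply the n Case I bounds.
    """
    if B <= 0.0 or l <= 0.0:
        raise ValueError("B and l must be positive")
    if B * l < 1.0:
        return 1.0 / (1.0 - B * l)
    n = int(math.floor(2.0 * B * l)) + 1   # n > 2*B*l, hence B*l/n < 1/2
    return (1.0 - B * l / n) ** (-n)
```

Any other subdivision with B·l_i < 1 would serve equally well; the choice affects only the size of D, not its dependence on B and l alone.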


5. The approximation of horocycles by geodesic circles. Let P be a point
of Ψ, A a point of U, and C(P, A) the horocycle determined by P and A. (For
the definition, existence and properties of horocycles, cf. Grant.) Let Q be a
point of C(P, A). It has been shown by Grant (Theorem 7.2) that if C_1, C_2, ···
is a sequence of geodesic circles with centers approaching A, and if C_n has on it
a point P_n such that lim_{n→∞} P_n = P, then C_n has on it a point Q_n such that
lim_{n→∞} Q_n = Q. A further property of this approximating process will be derived by













showing that the length of the smaller of the arcs P,Q, of C, remains bounded 
as n becomes infinite. To that end we first prove the following lemma. 


Lemma 5.1. Given a geodesic circle C, there exists a constant b such that the 
length of the arc which lies within C of any geodesic circle with center exterior to C 
does not exceed b. 


If geodesic polar coördinates are set up with the point P(u, v) of Ψ as center,
the function G(r, φ) in the resulting quadratic form

        dr^2 + G^2(r, \phi)\, d\phi^2

will depend on the choice of P. We indicate this by writing G(r, φ, u, v) instead
of G(r, φ). By application of the theorems concerning the dependence of
solutions of differential equations on the initial conditions, it can be shown that
G(r, φ, u, v) depends continuously on the coördinates of P and the choice of
initial direction. It follows that if P is restricted to lying on C and τ is a positive
constant, there exists a constant h(τ) such that

(5.1)    G(r, \phi, u, v) \le h(\tau), \qquad (u, v) \text{ on } C, \quad 0 \le r \le \tau.

Let T be the center and t the radius of C. Let S, any point of Ψ exterior to C,
be the center of a geodesic circle C′ which cuts across C in points V and W.
The segment VW of C′ which lies within and on C intersects the geodesic
determined by T and S in one point X (for the points V and W lie on opposite sides
of the geodesic (cf. Grant, Theorem 4.4) and the geodesic intersects C′ in just
two points) and we consider the segment XV thus determined. Let M be the
point in which the geodesic segment joining T and S intersects C. Then M is
interior to C′, for the radius of C′ must be greater than the length of SM if C′
is to intersect C in two points. If we set up geodesic polar coördinates with
M as center, XV is given by a function of the form r = r(φ), φ_1 ≤ φ ≤ φ_2,
where r(φ) is an increasing or decreasing function of φ (this is easily proved),
and we can assume that 0 ≤ φ_1 < φ_2 ≤ 2π. Furthermore, since every point
of XV lies within or on C, its distance from M cannot exceed 2t. Let h(2t)
be the constant as determined in (5.1). Then the length of XV is given by

        L = \int_{\phi_1}^{\phi_2} \sqrt{ \left( \frac{dr}{d\phi} \right)^2 + G^2(r, \phi) }\; d\phi,

where G(r, φ) is G(r, φ, u, v) when (u, v) are the coördinates of M. It follows
that

        L \le \int_{\phi_1}^{\phi_2} \left\{ \left| \frac{dr}{d\phi} \right| + G(r, \phi) \right\} d\phi,

and since dr/dφ does not change sign, with the aid of (5.1) we obtain

        L \le | r(\phi_2) - r(\phi_1) | + h(2t)(\phi_2 - \phi_1) \le 2t + 2\pi h(2t).










A similar result applies to the arc XW of C′ and thus the length of the arc
VW of C′ cannot exceed 4t + 4πh(2t). If we set b equal to this number, the
stated lemma is proved.
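
To give a sense of the size of the constant b, the following Python sketch evaluates the bound 4t + 4πh(2t) obtained by doubling the estimate 2t + 2πh(2t) for each of the arcs XV and XW. It is an illustration only and assumes constant curvature K = −c², in which case G(r) = sinh(cr)/c for every center, G is increasing in r, and one may take h(τ) = sinh(cτ)/c. The function names are hypothetical.

```python
import math


def h_constant_curvature(tau, c=1.0):
    """Upper bound for G(r) on 0 <= r <= tau when K = -c^2 is constant."""
    return math.sinh(c * tau) / c


def arc_length_bound(t, c=1.0):
    """The constant b = 4 t + 4 pi h(2 t) for a geodesic circle C of radius t."""
    return 4.0 * t + 4.0 * math.pi * h_constant_curvature(2.0 * t, c)
```

The bound depends only on the radius t of C, and not on the circle C′ whose arc is being measured, which is the point of the lemma.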

The extended approximation theorem is as follows. 


THEOREM 5.1. Let P be a point of Ψ, A a point of U, C(P, A) the horocycle
determined by P and A, and Q an arbitrary point of C(P, A). If C_n (n = 1, 2, ···)
is a sequence of geodesic circles with centers approaching A and C_n has on it a point
P_n (n = 1, 2, ···) such that lim_{n→∞} P_n = P, then there exist a constant L and a
point Q_n on C_n (n = 1, 2, ···) such that the point Q_n lies on the arc of C_n of length
L with midpoint at P_n and lim_{n→∞} Q_n = Q.

The part of the theorem which states that Q_n exists on C_n such that
lim_{n→∞} Q_n = Q has been proved by Grant (Grant, Theorem 7.2), so it remains to prove
the existence of the constant L with the stated property.

Let C be a geodesic circle with center P and containing Q in its interior.
Then there exists an N such that for n ≥ N, the center of C_n is exterior to C
and P_n and Q_n are interior to C. Let L be so chosen that the arc of length
L of C_n (n = 1, 2, ···, N − 1) with midpoint P_n contains Q_n and also such that
L > 2b, where b is the constant determined by C as in Lemma 5.1. Then the
stated theorem holds for n = 1, 2, ···, N − 1. For n ≥ N, P_n and Q_n lie
within C and the center of C_n is exterior to C. It follows from Lemma 5.1
that the length of the arc P_nQ_n of C_n lying within C cannot exceed b. But since
L > 2b, the arc of C_n of length L with midpoint at P_n must contain Q_n. The
proof of the theorem is complete.


6. Two-dimensional manifolds of negative curvature and their classification.
Let F be a Fuchsian group with principal circle U. We now impose the
following condition on λ(u, v).


(H.4) λ(u, v) is invariant under the group F.


The condition (H.4) implies the invariance of the metric (2.1), and hence the
invariance of length, angle, area and curvature as determined by (2.1), as well
as the invariance of the geodesics. Two points P_1 and P_2 of Ψ are congruent
if there is a transformation of the group taking one into the other. Two
elements e_1(u_1, v_1, θ_1) and e_2(u_2, v_2, θ_2) are congruent if P_1(u_1, v_1) and P_2(u_2, v_2)
are congruent and there is a transformation of the group F taking P_1 into P_2
such that the direction θ_1 at P_1 is transformed into the direction θ_2 at P_2.

If congruent points are considered identical, there is defined a two-dimensional
manifold M of negative curvature. The topological properties of M are
determined by the group F. If F contains elliptic transformations, such a
transformation having necessarily one fixed point in Ψ and the other fixed point
exterior to U, M has one or more singularities of the nature of cusps. That is,
the sum of the angles at such a point is less than 2π.













We classify these manifolds according to the properties of F as follows. (For
definitions of Fuchsian groups of the first and second kind, cf. Ford, p. 68.)

Class I. F is of the first kind and has a finite set of generators. 

Class Il. F is of the first kind and has an infinite set of generators. 

Class III. F is of the second kind. 

Manifolds of class I may or may not be closed. The manifold will be closed
if the fundamental region R_0 of F has no boundary points on U. These
manifolds include, in particular, all closed, orientable surfaces of negative curvature
and such that the coefficients of the first fundamental form are of class C’. 
For, from the Gauss-Bonnet formula, such a surface is necessarily of genus 
greater than one. The universal covering surface of such a surface can be 
mapped conformally into the interior of U and the resulting quadratic form 
will be given by (2.1) with A invariant under a Fuchsian group F, the funda- 
mental region of which has no boundary points on U. Since the surface is 
closed, it is easily seen that the conditions (H.1), (H.2) and (H.3) are also satis- 
fied. 

If, on the other hand, M is of class I but the fundamental region R_0 of F
has one or more boundary points on U, M is not closed. In this case F contains
a parabolic transformation and M contains a region which is of the topological
type of a half-cylinder, of infinite length, and such that a simple closed curve
can sweep out this region with its length approaching zero. This is an example
of what Hadamard terms a “nappe non évasée” (cf. Hadamard [1], p. 35). We
will term such a region a parabolic opening.

It is easily shown that the area of a manifold of class I is finite. The area 
of such a manifold is the area of the fundamental region of the group F defining 
the manifold. 

In the case of manifolds of class III, the fundamental region R_0 contains
on its boundary an interval of U (cf. Ford [1], Theorem 14, p. 73). If the
end points of such an interval are end points of paired sides of R_0, the manifold
M contains a region which is of the topological type of a half-cylinder, of
infinite length and such that the length of any closed curve which sweeps out this
region becomes infinite. This is an example of a “nappe évasée” of Hadamard
(cf. Hadamard [1], p. 35). We will term such a region a hyperbolic opening.

A manifold of class III is necessarily of infinite area. 


7. Stability in the sense of Poisson and transitivity. A motion of a dynamical 
system is stable in the sense of Poisson if the motion returns infinitely often to 
any given neighborhood of its initial state (cf. Poincaré [1], p. 141). A motion 
of a dynamical system is transitive if the points of the motion are everywhere 
dense in the phase space. In the case under consideration, namely, that of the 
geodesics on the manifold M of §6, it will be desirable to formulate these defini- 
tions in different, but equivalent terms. 

A point e of E determines a point of Ψ and a direction at that point. Thus
each element determines a directed geodesic ray having this element as its initial 







element. Let gr be this geodesic ray, P its initial point and A its point at
infinity. The point e of E, or element, will be said to be stable (in the sense of
Poisson) if there exists a sequence of points P_1, P_2, ··· on gr, with
lim_{n→∞} P_n = A, such that if e_n is the element of gr at P_n, there is an element
e_n′ congruent to e_n such that lim_{n→∞} e_n′ = e. The element opposite e is the
element at the same point as e but oppositely directed. The element e will be
said to be completely stable if both e and its opposite are stable.

If M is closed, it follows from a well known recurrence theorem of Poincaré
that almost all points of E are completely stable. It has been pointed out by
E. Hopf ([2], p. 712) that the same result holds if the space of motions under
consideration is of finite volume. Since all manifolds of class I have finite
area, the volume of the corresponding space of motions is finite and we have the
following result.


THEOREM 7.1. If M is a manifold of class I, almost all points of E are
completely stable.


The point e of E will be said to be unstable (E. Hopf, fliehende) if, gr being
the directed geodesic ray determined by e, and $P_1, P_2, \ldots$ being any sequence
of points of gr such that $\lim_{n\to\infty} P_n = A$, A the point at infinity of gr, then the
points $P_1, P_2, \ldots$, together with all congruent points, have no cluster point in V.
This is equivalent to the condition that the geodesic ray of M determined by e
shall eventually pass out of and remain outside of any finite region of M. An
element which is not stable is not necessarily unstable. In the case of closed
manifolds of class I there are no unstable elements, but not all elements are
stable. However, it has been shown by E. Hopf ([2], Theorem 1) that for a
class of dynamical systems which includes all those under consideration here,
almost all points of E are either stable or unstable.

The element e will be said to be transitive if the elements on the geodesic ray
gr determined by e and on the set of geodesic rays congruent to gr form a set
which is everywhere dense in E. It is known that there exist transitive
elements on manifolds of class I or class II (cf. Hedlund [1]). There are no
transitive elements on a manifold of class III.

The property that almost all the elements be transitive can be shown to be
equivalent to a simple property of the flow defined by the geodesics (cf. Tuller
[1], p. 92). To that end, if S is a set in E, let $S_g$ denote the set of elements on
the geodesic rays determined by the points of S. Let $S^*$ be the set consisting
of S together with all congruent sets. It is easily shown that the sets $(S_g)^*$
and $(S^*)_g$ are identical and this set will be denoted by $S_g^*$.


DEFINITION 7.1. The manifold M has Property B if, S being any measurable
set of E of positive measure, the set $S_g^*$ is everywhere dense in E.


THEOREM 7.2. A necessary and sufficient condition that almost all elements be 
transitive is that Property B hold. 











244 GUSTAV A. HEDLUND 


If almost all points of E are transitive and S is a set of positive measure,
S must contain a transitive element. But then the set $S_g^*$ contains all the
elements on the ray determined by this element, as well as on all congruent
rays, and thus the set $S_g^*$ must be everywhere dense in E. It follows that
Property B holds.

Assuming that Property B holds, let $O_1, O_2, \ldots$ be a sequence of open sets
in E such that $E \subset \sum_{n=N}^{\infty} O_n$ for all N, and such that the maximum Euclidean
diameter of $O_n$ approaches zero as n becomes infinite. Let $E_n$ be the set of
elements such that if $e \in E_n$, the set $e_g^*$ has an element in $O_n$. Then the set
$\{C_E(E_n)\}_g^*$ contains no element of $O_n$, and since Property B holds, it follows
that $C_E(E_n)$ must be a zero set. This implies that the set $\prod_{n=1}^{\infty} E_n$ constitutes
almost all points of E. But if an element belongs to all $E_n$, it is evidently
transitive and thus almost all points of E are transitive. The proof of the
theorem is complete.
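
The measure-theoretic step at the end of this proof can be spelled out as follows. This is only a sketch in modern notation, not part of the original argument: it reads the product of the sets $E_n$ as their intersection, and the symbol "meas" for the measure on E is an assumption introduced here.

```latex
% Sketch: why the intersection of the sets E_n carries almost all of E.
% Assumptions: "meas" is the measure on E; C_E(.) is the complement in E;
% each C_E(E_n) is a zero set, as shown in the proof.
\[
  C_E\Bigl(\,\bigcap_{n=1}^{\infty} E_n\Bigr)
    \;=\; \bigcup_{n=1}^{\infty} C_E(E_n),
  \qquad
  \operatorname{meas}\Bigl(\,\bigcup_{n=1}^{\infty} C_E(E_n)\Bigr)
    \;\le\; \sum_{n=1}^{\infty} \operatorname{meas}\bigl(C_E(E_n)\bigr) \;=\; 0 .
\]
```

An element lying in every $E_n$ is transitive because any open set of E contains some $O_n$: the sets $O_n$ with $n \ge N$ cover E for every N while their diameters tend to zero.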


8. The measure of the transitive elements on manifolds of class I. Let M
be a manifold of class I and let S' be a set of positive measure of E. According
to Theorem 7.1, almost all points of E are completely stable, so that, except for
a set of measure zero, the points of S' are completely stable. Let S be the set
of completely stable points of S'.

Let $S(u, v)$, $u^2 + v^2 < 1$, denote the set of values of $\varphi$ such that $(u, v, \varphi)$
belongs to S. Then, according to a well known theorem of point set theory,
one of the sets $S(u_0, v_0)$ is a measurable linear set of positive measure (in the
Euclidean sense). Furthermore, the set $S(u_0, v_0)$ must contain a value $\varphi_0$
at which the linear metric density is unity. Let m be the element $(u_0, v_0, \varphi_0)$.
This point can and will be chosen so that $0 < \varphi_0 < 2\pi$.

The element m determines a directed geodesic ray with initial point
$M_0(u_0, v_0)$. Let B be the point at infinity of this geodesic ray and let this ray be
denoted by $gr(M_0, B)$. Since the element m is stable, there exists on $gr(M_0, B)$
a sequence of points $M_n$ (n = 1, 2, ...) such that $\lim_{n\to\infty} M_n = B$, and if $m_n$ is
the element of $gr(M_0, B)$ at $M_n$, there is an element $e_n$ congruent to $m_n$ such
that $\lim_{n\to\infty} e_n = m$. If $\bar m_n$ is the element opposite $m_n$ and $\bar e_n$ is the element opposite
$e_n$, it follows that $\lim_{n\to\infty} \bar e_n = \bar m$, where $\bar m$ is the element opposite m. Let A be
the point at infinity of the geodesic ray determined by $\bar m$. Then if $T_n$ denotes
the transformation of the group F taking $m_n$ into $e_n$, we have $T_n(\bar m_n) = \bar e_n$,
and since $\lim_{n\to\infty} \bar e_n = \bar m$, it follows that $\lim_{n\to\infty} T_n(M_0) = A$.


Let $C'_n$ be the geodesic circle with center $M_0$ and passing through $M_n$, and
let $C_n$ be the geodesic circle $T_n(C'_n)$. Then, as n becomes infinite, the center
$T_n(M_0)$ of $C_n$ approaches A and $C_n$ has on it a point $P_n$, the point bearing $e_n$
(and $\bar e_n$), such that $\lim_{n\to\infty} P_n = M_0$. If Q is an arbitrary point of the horocycle
$C(M_0, A)$, Theorem 5.1 states that there exist a constant L and sequence of
points $Q_n$ (n = 1, 2, ...) such that $\lim_{n\to\infty} Q_n = Q$ and $Q_n$ lies on the arc of $C_n$
of length L with midpoint $P_n$. Thus, if we apply the transformation $T_n^{-1}$, if
$l_n$ is the arc of $C'_n$ of length L and with midpoint $M_n$, $l_n$ has on it a point $Q'_n$
such that $\lim_{n\to\infty} T_n(Q'_n) = Q$.


If the point $M_0$ is joined to the points of $l_n$ by directed geodesic segments
with initial points at $M_0$, the values of $\varphi$ of the initial elements of these
segments form a set $\alpha_n \le \varphi \le \beta_n$, where $\alpha_n < \varphi_0 < \beta_n$. Of these values of $\varphi$,
let those which belong to $S(u_0, v_0)$ be denoted by $E_n^{\varphi}$, and those which do not by
$F_n^{\varphi}$. Since the length of $l_n$ is L and does not change with n, $\lim_{n\to\infty} |\alpha_n - \beta_n| = 0$.
Since the density of the set $S(u_0, v_0)$ is unity at $\varphi_0$, it follows that
\[
  \lim_{n\to\infty} \frac{\mu F_n^{\varphi}}{\mu E_n^{\varphi}} = 0,
\]
where $\mu$ denotes linear measure.

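Let $E_n$ denote the set of points of $l_n$ determined by the geodesic segments
for which the $\varphi$ of the initial element is in $E_n^{\varphi}$, and let $F_n$ denote the remaining
points of $l_n$. Then if the measure of sets on $l_n$ in terms of the arc length on $l_n$
is denoted by $\nu$, we have
\[
  \nu E_n = \int_{E_n} ds = \int_{E_n^{\varphi}} \frac{ds}{d\varphi}\, d\varphi = \int_{E_n^{\varphi}} G(r, \varphi)\, d\varphi,
\]
\[
  \nu F_n = \int_{F_n} ds = \int_{F_n^{\varphi}} \frac{ds}{d\varphi}\, d\varphi = \int_{F_n^{\varphi}} G(r, \varphi)\, d\varphi,
\]
where the integrals are Lebesgue integrals and $G(r, \varphi)$ is that determined by
setting up geodesic polar coördinates with $M_0$ as center. If $G_n^{M}$ denotes the
maximum and $G_n^{m}$ the minimum of $G(r, \varphi)$ on $l_n$, it follows that
\[
  \frac{\nu F_n}{\nu E_n}
  = \frac{\int_{F_n^{\varphi}} G(r, \varphi)\, d\varphi}{\int_{E_n^{\varphi}} G(r, \varphi)\, d\varphi}
  \le \frac{G_n^{M}\, \mu F_n^{\varphi}}{G_n^{m}\, \mu E_n^{\varphi}} .
  \tag{8.1}
\]
For n sufficiently large, the radius of the geodesic circle of which $l_n$ is a segment
is greater than unity, and it follows from Theorem 4.1 that
\[
  \frac{G_n^{M}}{G_n^{m}} < B .
\]
If we combine this with (8.1), there results
\[
  \lim_{n\to\infty} \frac{\nu F_n}{\nu E_n} \le B \lim_{n\to\infty} \frac{\mu F_n^{\varphi}}{\mu E_n^{\varphi}} = 0 .
  \tag{8.2}
\]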













Since $\nu E_n + \nu F_n = L$, it follows from (8.2) that
\[
  \lim_{n\to\infty} \nu E_n = L .
  \tag{8.3}
\]
This implies that the length of the segment of maximum length of $l_n$ which
does not contain a point of $E_n$ must approach zero as n becomes infinite.
Consequently, there exists a sequence of points $Q''_n$ (n = 1, 2, ...), $Q''_n \in E_n$, such
that the length of the segment $Q'_n Q''_n$ of $l_n$ approaches zero as n becomes infinite.
But then the geodesic distance between $Q'_n$ and $Q''_n$ must also approach zero,
and since $\lim_{n\to\infty} T_n(Q'_n) = Q$, it follows that $\lim_{n\to\infty} T_n(Q''_n) = Q$.


Let $m'_n$ be the element at $Q''_n$ of the directed geodesic ray with initial point
at $M_0$ and passing through $Q''_n$. Then by definition of $S_g^*$, $m'_n$ and $T_n(m'_n)$
belong to $S_g^*$. Since $\lim T_n(M_0) = A$ and $\lim T_n(Q''_n) = Q$, it follows that
$\lim T_n(m'_n) = q$, where q is the element at Q opposite to the initial element of
$gr(Q, A)$. It has been shown by Grant (Grant, Theorem 6.3) that $gr(Q, A)$
is orthogonal to $C(M_0, A)$ at Q, so that if $C_R(M_0, A)$ denotes the right horocycle
determined by $M_0$ and A, the element q is obtained by rotating the element of
$C_R(M_0, A)$ at Q through the angle $\tfrac{1}{2}\pi$.

Also it has been shown by Grant (Grant, Theorem 2.4') that since the element
m is completely stable, and thus $\bar m$ is stable, $C_R(M_0, A)$ is transitive; that is,
the elements of $C_R(M_0, A)$ together with congruent elements form a set which
is everywhere dense in E. But then the elements obtained by rotating each
of the elements of $C_R(M_0, A)$ through the angle $\tfrac{1}{2}\pi$ must form a set such that it,
together with the congruent sets, forms a set which is everywhere dense in E.
Thus if $E_0$ is an arbitrary open set of E, the point Q can be so chosen that either
q or a congruent element lies in $E_0$. It has been shown that there exists a
sequence of elements $T_n(m'_n)$ of $S_g^*$ such that $\lim_{n\to\infty} T_n(m'_n) = q$. But then,
since $S_g^*$ includes all elements congruent to the elements $T_n(m'_n)$, $S_g^*$ must contain
elements in $E_0$. The set $S_g^*$ is everywhere dense in E, and Property B holds.
With the aid of Theorem 7.2, the proof of the following theorem is complete.

THEOREM 8.1. Almost all the elements on a manifold of class I are transitive. 

A directed geodesic is transitive if the elements on the geodesic and on all
congruent geodesics together form a set which is everywhere dense in E. If
one element of the geodesic is transitive, the geodesic is transitive. A set of
geodesics will be said to constitute almost all the geodesics if the elements on these
and congruent geodesics constitute almost all points of E. The preceding
theorem can now be restated as follows.

THEOREM 8.2. Almost all geodesics on a manifold of class I are transitive. 


In view of the remarks in §6 concerning closed surfaces of negative curvature, 
we can state the following result. 
° ° 8 
COROLLARY 8.2'. Almost all geodesics on a closed orientable surface of class C
and of negative curvature are transitive. 










For if the functions defining the surface are of class C’, the coefficients of the 
first fundamental quadratic form are of class C’, and the surface is included 
among the manifolds of class I. 


COROLLARY 8.2''. Almost all the geodesics on a closed non-orientable surface
of class C° and of negative curvature are transitive. 

For such a surface has a closed orientable covering surface of multiplicity 
two and a transitive geodesic on the covering surface has as correspondent a 
transitive geodesic in the non-orientable surface. The statement of the corollary 
is thus a consequence of Corollary 8.2’. 


9. The measure of the unstable elements on manifolds of class III. Let M
be a manifold of class III, that is, one for which the defining Fuchsian group F
is of the second kind. Then the fundamental region $R_0$ abuts on the circle U
along one or more arcs, and the region R, which consists of $R_0$ together with its
reflection in U, contains the interior points of these arcs in its interior (cf.
Ford [1], p. 74). Let $\alpha$ denote the set of interior points of these arcs.

Suppose that A is a point of the set $\alpha$ and e is an element determining a
geodesic ray with point at infinity A. We show that e is unstable. Recall
that e has been defined as unstable if, $P_1, P_2, \ldots$ being any sequence of points
of the geodesic ray determined by e such that $\lim_{n\to\infty} P_n = A$, these points together
with all congruent points have no cluster point in V. Since A is an interior
point of R, for n sufficiently large, the point $P_n$ lies in $R_0$. But then (Ford [1],
Theorem 9, p. 71) $P_n$ is nearer the center of U than any point congruent to it.
Since $\lim_{n\to\infty} P_n = A$, and any closed region lying within U is covered by a finite
number of transforms of $R_0$ (Ford [1], p. 70), it follows that points congruent
to the set $P_n$ (n = 1, 2, ...) cannot have a cluster point in V. The following
lemma has been proved.


LEMMA 9.1. If A is a point of U belonging to the set $\alpha$, and e is an element
determining a geodesic ray with point at infinity A, e is unstable.


THEOREM 9.1. Almost all elements on manifolds of class III are unstable. 


Suppose that the statement of the theorem is not true. Then it follows
from a theorem of Hopf (cf. E. Hopf [2], Theorem 1) that there must be a set S
of stable elements of E such that S is of positive measure. Let the element
$m(u_0, v_0, \varphi_0)$ be chosen as in the proof of Theorem 8.1, where the present set S
takes the place of the set S used in the proof of Theorem 8.1. Let $M_0$ again
be the point $(u_0, v_0)$ and let A be the point at infinity of the geodesic ray with
initial element $\bar m$, the element opposite m. If q is an element obtained by
rotating an arbitrary element of $C_R(M_0, A)$ through the angle $\tfrac{1}{2}\pi$, the proof of
Theorem 8.1 shows that in the case under consideration the set $S_g^*$ contains a
sequence of elements $q_n$ (n = 1, 2, ...) such that $\lim_{n\to\infty} q_n = q$.


Now let $D \ne A$ be a point belonging to the set $\alpha$. The geodesic with initial
point at infinity A and terminal point at infinity D cuts across $C_R(M_0, A)$
orthogonally at a point Q. Choosing this as the point Q above, if $D_n$ denotes
the point at infinity of the geodesic ray with initial element $q_n$, since $\lim_{n\to\infty} q_n = q$,
we see that $\lim_{n\to\infty} D_n = D$. But then for n sufficiently large, $D_n$ belongs to the
set $\alpha$, and it follows from Lemma 9.1 that $q_n$, n sufficiently large, must be
unstable. It is easily shown that if an element is stable, the same is true of all the
elements on the geodesic ray determined by this element. Since it was assumed
that all elements of S were stable, the same would be true of all elements in $S_g^*$.
But if $q_n$ is unstable, it cannot be stable, and from this contradiction we infer
the truth of the theorem.


BIBLIOGRAPHY 
L. R. FORD.
1. Automorphic Functions, New York, 1929.
J. HADAMARD.
1. Les surfaces à courbures opposées et leurs lignes géodésiques, Journal de Mathématiques
pures et appliquées, (5), vol. 4(1898), pp. 27-73.
G. A. HEDLUND.
1. Two-dimensional manifolds and transitivity, Annals of Mathematics, vol. 37(1936),
pp. 534-542.
2. Fuchsian groups and transitive horocycles, this Journal, vol. 2(1936), pp. 530-542.
E. HOPF.
1. Fuchsian groups and ergodic theory, Transactions of the American Mathematical
Society, vol. 39(1936), p. 299.
2. Zwei Sätze über den wahrscheinlichen Verlauf der Bewegungen dynamischer Systeme,
Mathematische Annalen, vol. 103(1930), pp. 710-719.
M. MORSE.
1. Instability and transitivity, Journal de Mathématiques pures et appliquées, (9),
vol. 14(1935), pp. 49-71.
A. TULLER.
1. The measure of the transitive geodesics on certain three-dimensional manifolds, this
Journal, vol. 4(1938), pp. 78-94.


BRYN MAWR COLLEGE.



















A PROOF THAT EVERY UNIFORMLY CONVEX SPACE IS REFLEXIVE 
By B. J. PETTIS


The purpose of the present note is to communicate an independent proof of a
result of Milman's [6]¹ to the effect that every uniformly convex space is
necessarily reflexive. Our proof is quite different from Milman's, being based on the
use of bounded additive measure functions rather than on that of transfinite
closure for closed convex sets.

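Let $\mathfrak{X} = [x]$ be a Banach space, $\bar{\mathfrak{X}} = [\gamma]$ its adjoint, and $\bar{\bar{\mathfrak{X}}} = [F]$ the adjoint
of $\bar{\mathfrak{X}}$. The space $\mathfrak{X}$ is said to be reflexive² if for each $F_0 \in \bar{\bar{\mathfrak{X}}}$ there is an $x_0 \in \mathfrak{X}$
such that $F_0(\gamma) = \gamma(x_0)$ holds for all $\gamma \in \bar{\mathfrak{X}}$. The concept of $\mathfrak{X}$ being a uniformly
convex space, a concept of Clarkson's [2], may be defined in the following
fashion: given $\varepsilon > 0$ there is a $\zeta_\varepsilon > 0$ with the property that
\[
  \text{if } x, y \in \mathfrak{X} \text{ have } \|x\| = \|y\| = 1 \text{ and } \|x - y\| \ge \varepsilon, \text{ then } \|x + y\| \le 2 - \zeta_\varepsilon .
  \tag{$*$}
\]

The theorem may now be stated as follows.


THEOREM (Milman). If $\mathfrak{X}$ is isomorphic to a uniformly convex space, then $\mathfrak{X}$
is reflexive.
We first establish two lemmas.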
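
Before the lemmas, condition (*) can be illustrated in the simplest case. The following worked computation is not from the paper; it assumes the norm comes from an inner product, so that the parallelogram law is available, and it writes $\zeta_\varepsilon$ for the constant appearing in (*).

```latex
% Illustration (an assumption of this note, not the author's example):
% every inner-product space satisfies (*) with zeta_eps = eps^2 / 4.
\[
  \|x+y\|^2 \;=\; 2\|x\|^2 + 2\|y\|^2 - \|x-y\|^2 \;\le\; 4 - \varepsilon^2
  \qquad\text{when } \|x\| = \|y\| = 1,\ \|x-y\| \ge \varepsilon .
\]
% Use sqrt(1 - t) <= 1 - t/2 for 0 <= t <= 1 with t = eps^2 / 4.
\[
  \|x+y\| \;\le\; 2\sqrt{1 - \varepsilon^2/4} \;\le\; 2 - \frac{\varepsilon^2}{4},
  \qquad\text{so } \zeta_\varepsilon = \frac{\varepsilon^2}{4}
  \text{ is admissible for } 0 < \varepsilon \le 2 .
\]
```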


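LEMMA 1.³ If $\mathfrak{X}$ is uniformly convex, then given $\gamma_0 \in \bar{\mathfrak{X}}$ with $\|\gamma_0\| \ne 0$, there
exists a unique $x_0 \in \mathfrak{X}$ satisfying the conditions $\|x_0\| = 1$ and $\gamma_0(x_0) = \|\gamma_0\|$.
Moreover, given $\varepsilon > 0$, there exists a $\delta_\varepsilon > 0$ such that if x and y in $\mathfrak{X}$ and $\gamma$ in $\bar{\mathfrak{X}}$
satisfy the conditions $\|x\| = 1$, $\|y\| \le 1$, $\|x - y\| \ge \varepsilon$ and $\gamma(x) = \|\gamma\|$,
then $\gamma(y) \le (1 - \delta_\varepsilon)\,\|\gamma\|$.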


In proving the first part it is clearly sufficient to consider only the case
$\|\gamma_0\| = 1$. By definition of $\|\gamma_0\|$ there then exists a sequence $\{x_n\}$ with
$\|x_n\| = 1$ and $1 \ge \gamma_0(x_n) > 1 - n^{-1}$. To see that $\{x_n\}$ is a Cauchy sequence


This note, as first submitted to the editors, was part of the manuscript of a paper en- 
titled On differentiation in Banach spaces which was presented to the American Mathemat- 
ical Society, September 6, 1938, and abstracted in the Bull. Amer. Math. Soc., vol. 44, no. 
7 (July, 1938), p. 486. The manuscript was received August 30, 1938, before receipt of 
Milman’s article in this country. At the suggestion of the referee the present paper is 
being published separately. It was received in the above form April 8, 1939. 

¹ Numbers in brackets refer to the list of references.

² Such spaces were introduced by Hahn [4] under the name of regular. The present
term reflexive is due to Lorch.

³ The first statement contained in Lemma 1 was discovered independently by J. A.
Clarkson and E. J. McShane in 1936. I am grateful to them for permission to include it
here.













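let $\varepsilon > 0$ be given, let $\zeta_\varepsilon$ correspond to $\varepsilon$ according to (*), and choose $n_\varepsilon > 2/\zeta_\varepsilon$.
Then $m \ge n_\varepsilon$, $n \ge n_\varepsilon$ imply that
\[
  \gamma_0(x_m + x_n) = \gamma_0(x_m) + \gamma_0(x_n) > 2 - \frac{2}{n_\varepsilon} > 2 - \zeta_\varepsilon
\]
and hence that $\|x_m + x_n\| > 2 - \zeta_\varepsilon$. Since $\|x_m\| = \|x_n\| = 1$, this last
inequality yields $\|x_m - x_n\| < \varepsilon$ whenever $m, n \ge n_\varepsilon$, since otherwise (*)
is contradicted.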
Let $x_0 = \lim_n x_n$. Then
\[
  \|x_0\| = \lim_n \|x_n\| = 1, \qquad \gamma_0(x_0) = \lim_n \gamma_0(x_n) = \|\gamma_0\| .
\]
If there were another point $x_1$ such that $\|x_1\| = 1$ and $\gamma_0(x_1) = 1$, and if
$\|x_0 - x_1\| = \rho > 0$, then from (*) it would follow that $\|x_0 + x_1\| \le 2 - \zeta_\rho < 2$
and hence
\[
  2 > |\gamma_0(x_0 + x_1)| = \gamma_0(x_0) + \gamma_0(x_1) = 2,
\]
an impossibility. This establishes the first half of the lemma.

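To prove the second part let any $\varepsilon > 0$ be given and set $\eta = \min(\tfrac{1}{2}, \tfrac{1}{2}\varepsilon)$.
It follows that if $\zeta_\eta$ corresponds to $\eta$ according to (*), then $\delta_\varepsilon = \min(\zeta_\eta, \eta)$
is a satisfactory $\delta_\varepsilon$. For if $\|y\| < 1 - \eta$, then $\gamma(y) \le \|\gamma\| \cdot (1 - \eta) \le
\|\gamma\| \cdot (1 - \delta_\varepsilon)$. And if $1 \ge \|y\| \ge 1 - \eta$, let $z = y/\|y\|$. Should
$\|x - y\| \ge \varepsilon$ hold, we must have
\[
  \|x - z\| \ge \|x - y\| - \|y - z\| \ge \varepsilon - \|y\| \cdot \Bigl| 1 - \frac{1}{\|y\|} \Bigr|
  = \varepsilon - (1 - \|y\|) \ge \varepsilon - \eta \ge \eta,
\]
where $\|z\| = \|x\| = 1$. From (*) it then follows that $\|x + z\| \le 2 - \zeta_\eta$,
and therefore when $\gamma(x) = \|\gamma\|$, it is true that
\[
  \gamma(z) = \gamma(x + z) - \gamma(x) \le (2 - \zeta_\eta) \cdot \|\gamma\| - \|\gamma\|
  = (1 - \zeta_\eta) \cdot \|\gamma\| \le (1 - \delta_\varepsilon) \cdot \|\gamma\| .
\]
Thus $\gamma(y) = \|y\| \cdot \gamma(z) \le \|\gamma\| \cdot (1 - \delta_\varepsilon)$. The inequality $\gamma(y) \le \|\gamma\| \cdot (1 - \delta_\varepsilon)$
then holds uniformly for all x, y, $\gamma$ satisfying the conditions $\|x\| = 1$, $\|y\| \le 1$,
$\|x - y\| \ge \varepsilon$, and $\gamma(x) = \|\gamma\|$.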

The second lemma is a sharpening of a result of Goldstine’s [3]. 


LEMMA 2. If $\mathfrak{X}$ is an arbitrary B-space and $F_0(\gamma)$ is any linear functional
defined over $\bar{\mathfrak{X}} = [\gamma]$, then there exists a function $\beta(E)$ having the properties:

(i) $\beta(E)$ is defined, bounded, and additive over all subsets E of the unit sphere
S of $\mathfrak{X}$;

(ii) $\beta(E)$ is non-negative;

(iii) $\|F_0\| = \|\beta\| = \operatorname{Var}(\beta; S)$;

(iv) $F_0(\gamma) = \int_S \gamma(x)\, d\beta$, $\gamma \in \bar{\mathfrak{X}}$,

the integral in (iv) being the Radon-Stieltjes.⁴


Given $F_0$, Goldstine has proved the existence of an $\alpha(E)$ satisfying (i),
(iii), and (iv). Let $\alpha(E) = \pi(E) - \nu(E)$ be the Jordan decomposition of $\alpha$,
where $\pi$ and $\nu$ have properties (i) and (ii). For each set E in S let P(E) be the
projection of E through the origin, i.e., P(E) is the set of all x in S having $-x$
in E, and define $\omega(E) = \nu(P(E))$. Then $\omega(E)$ also has properties (i) and (ii),
and $\omega(S) = \nu(S)$. If $\beta(E) = \pi(E) + \omega(E)$, it now follows that $\beta(E)$ has
properties (i), (ii), and (iii); the first two are evident and the third follows
from the fact that
\[
  \|F_0\| = \operatorname{Var}(\alpha; S) = \pi(S) + \nu(S) = \beta(S) = \operatorname{Var}(\beta; S),
\]
the last equality arising from $\beta$ being non-negative.
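For every $\gamma \in \bar{\mathfrak{X}}$ we now obtain from the definition of the integral that
\[
  \int_S -\gamma(x)\, d\nu = \int_S \gamma(-x)\, d\nu = \int_S \gamma(x)\, d\omega,
\]
i.e., we make the "change of variable" $x' = -x$. From property (iv) of $\alpha(E)$,
\[
  F_0(\gamma) = \int_S \gamma(x)\, d\pi - \int_S \gamma(x)\, d\nu
            = \int_S \gamma(x)\, d\pi + \int_S -\gamma(x)\, d\nu
\]
\[
            = \int_S \gamma(x)\, d\pi + \int_S \gamma(x)\, d\omega = \int_S \gamma(x)\, d\beta .
\]
This completes the proof of Lemma 2.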

In proving the main theorem we may assume $\mathfrak{X}$ itself to be uniformly convex,
since any space isomorphic to a reflexive space is necessarily reflexive.

We thus suppose $\mathfrak{X}$ to be uniformly convex, and seek to show that for each
$F_0 \in \bar{\bar{\mathfrak{X}}}$ there is an $x_0 \in \mathfrak{X}$ such that $F_0(\gamma) = \gamma(x_0)$ for all $\gamma \in \bar{\mathfrak{X}}$. It is sufficient to
consider only those points in $\bar{\bar{\mathfrak{X}}}$ having unit norm. For such an $F_0$ there must
be elements $\gamma_n$ (n = 1, 2, ...) in $\bar{\mathfrak{X}}$ such that
\[
  \|\gamma_n\| = 1 \quad\text{and}\quad 1 = \|F_0\| \ge F_0(\gamma_n) > 1 - \frac{1}{n} .
  \tag{1}
\]
By Lemma 1 there exists a sequence $\{x_n\}$ satisfying the conditions
\[
  \|x_n\| = 1 \quad\text{and}\quad \gamma_n(x_n) = \|\gamma_n\| = 1 .
  \tag{2}
\]
It will be proved first that $\{x_n\}$ is a Cauchy sequence and then that $x_0 = \lim_n x_n$
has the property that $F_0(\gamma) = \gamma(x_0)$ for all $\gamma$.
For $F_0(\gamma)$ consider the $\beta(E)$ of Lemma 2 which has the properties
\[
  1 = \|F_0\| = \operatorname{Var}(\beta; S) = \beta(S),
  \tag{3}
\]


⁴ For a description of this integral see [5].













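\[
  F_0(\gamma) = \int_S \gamma(x)\, d\beta, \qquad \gamma \in \bar{\mathfrak{X}} .
  \tag{4}
\]
If we use (4), (1) leads to the inequality
\[
  1 - \frac{1}{n} < \int_S \gamma_n(x)\, d\beta
  = \int_{S_{n,\varepsilon}} \gamma_n(x)\, d\beta + \int_{S - S_{n,\varepsilon}} \gamma_n(x)\, d\beta,
  \tag{5}
\]
where $S_{n,\varepsilon} = S[\,\|x - x_n\| < \varepsilon\,]$. In view of (2) and Lemma 1, given $\varepsilon > 0$
there exists a positive $\delta_\varepsilon$ that is independent of n and has the property that
$1 - \delta_\varepsilon \ge \gamma_n(x)$ holds for all x in $S - S_{n,\varepsilon}$. Since this last inequality holds and
$\beta(E)$ is non-negative, (5) gives us
\[
  1 - \frac{1}{n} < \int_{S_{n,\varepsilon}} \gamma_n(x)\, d\beta + (1 - \delta_\varepsilon)\, \beta(S - S_{n,\varepsilon})
\]
or
\[
  1 - \frac{1}{n} < \beta(S_{n,\varepsilon}) + (1 - \delta_\varepsilon)\, \beta(S - S_{n,\varepsilon}),
  \tag{5$'$}
\]
since (2) implies that $|\gamma_n(x)| \le \|\gamma_n\| = 1$ holds over S. The function $\beta(E)$
being additive and $\beta(S)$ having the value 1, it follows from (5') that
\[
  \beta(S - S_{n,\varepsilon}) < \frac{1}{n\,\delta_\varepsilon} \qquad (n = 1, 2, \ldots).
  \tag{6}
\]
This is the basic inequality of the proof.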
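
The passage from (5') to (6) is a one-line rearrangement; the following sketch records it, writing $S_{n,\varepsilon}$ and $\delta_\varepsilon$ for the cap and the constant defined above and using only the additivity of $\beta$ and $\beta(S) = 1$.

```latex
% Sketch: how (6) follows from (5'), using additivity of beta and beta(S) = 1.
\begin{align*}
  1 - \frac{1}{n}
    &< \beta(S_{n,\varepsilon}) + (1-\delta_\varepsilon)\,\beta(S - S_{n,\varepsilon}) \\
    &= \bigl[\beta(S_{n,\varepsilon}) + \beta(S - S_{n,\varepsilon})\bigr]
       - \delta_\varepsilon\,\beta(S - S_{n,\varepsilon}) \\
    &= 1 - \delta_\varepsilon\,\beta(S - S_{n,\varepsilon}),
\end{align*}
\[
  \text{hence}\qquad \beta(S - S_{n,\varepsilon}) < \frac{1}{n\,\delta_\varepsilon}.
\]
```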

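Choosing $n_\varepsilon$ large enough to satisfy the condition $2 < n_\varepsilon \delta_\varepsilon$, we can now
assert, thanks to (6), that if $m \ge n_\varepsilon$ and $n \ge n_\varepsilon$, then $S_{n,\varepsilon} \cdot S_{m,\varepsilon}$ is not the
vacuous set $\Lambda$. For then $\beta(S) = 1$, $\beta(S - S_{n,\varepsilon}) < \tfrac{1}{2}$ and $\beta(S - S_{m,\varepsilon}) < \tfrac{1}{2}$,
and these statements imply that
\begin{align*}
  \beta(S_{n,\varepsilon} \cdot S_{m,\varepsilon})
    &= \beta\bigl(S - ((S - S_{n,\varepsilon}) + (S - S_{m,\varepsilon}))\bigr) \\
    &= \beta(S) - \beta\bigl((S - S_{n,\varepsilon}) + (S - S_{m,\varepsilon})\bigr) \\
    &\ge \beta(S) - \bigl(\beta(S - S_{n,\varepsilon}) + \beta(S - S_{m,\varepsilon})\bigr) > 0 .
\end{align*}
Hence $S_{n,\varepsilon} \cdot S_{m,\varepsilon} \ne \Lambda$. But it can be inferred from the last statement that for n
and $m \ge n_\varepsilon$, $\|x_m - x_n\| < 2\varepsilon$. Thus $\{x_n\}$ is a Cauchy sequence.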
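
The inference from the non-vacuous intersection to the Cauchy estimate is the triangle inequality applied at a common point; as a brief sketch, with x denoting any point lying in both caps:

```latex
% Sketch: a common point x of the two caps gives the Cauchy estimate.
\[
  x \in S_{n,\varepsilon} \cdot S_{m,\varepsilon}
  \;\Longrightarrow\;
  \|x_m - x_n\| \;\le\; \|x_m - x\| + \|x - x_n\| \;<\; \varepsilon + \varepsilon \;=\; 2\varepsilon .
\]
```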
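The final assertion is that $x_0 = \lim_n x_n$ has the desired property. To justify
this we use the fact that if $S_{0,\varepsilon} = S[\,\|x - x_0\| < \varepsilon\,]$, then
\[
  \beta(S - S_{0,\varepsilon}) = 0 \quad\text{for every } \varepsilon > 0 .
  \tag{7}
\]
If $\|x_n - x_0\| < \varepsilon' = \tfrac{1}{2}\varepsilon$, then $S_{0,\varepsilon} \supset S_{n,\varepsilon'}$ and hence $S - S_{0,\varepsilon} \subset S - S_{n,\varepsilon'}$.
This result and (6) imply that for all n so large that $\|x_n - x_0\| < \varepsilon'$,
\[
  0 \le \beta(S - S_{0,\varepsilon}) \le \beta(S - S_{n,\varepsilon'}) < \frac{1}{n\,\delta_{\varepsilon'}} .
\]
Hence (7) holds.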








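Now consider $F_0(\gamma) - \gamma(x_0)$; in view of (4) and (3) we can write
\[
  F_0(\gamma) - \gamma(x_0) = \int_S \gamma(x)\, d\beta - \int_S \gamma(x_0)\, d\beta ;
\]
hence, $\beta(E)$ being non-negative and (7) being true, it follows that
\[
  |F_0(\gamma) - \gamma(x_0)| \le \int_S |\gamma(x) - \gamma(x_0)|\, d\beta
  = \int_{S_{0,\varepsilon}} |\gamma(x - x_0)|\, d\beta
\]
\[
  \le \|\gamma\| \cdot \int_{S_{0,\varepsilon}} \|x - x_0\|\, d\beta
  \le \|\gamma\| \cdot \varepsilon \cdot \beta(S_{0,\varepsilon}) = \|\gamma\| \cdot \varepsilon
\]
is true for any $\varepsilon > 0$. Thus $F_0(\gamma) = \gamma(x_0)$ for each $\gamma$, and the proof is finished.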


REFERENCES 


1. S. BANACH, Théorie des Opérations Linéaires, Warsaw, 1932.

2. J. A. CLARKSON, Uniformly convex spaces, Trans. Amer. Math. Soc., vol. 40(1936),
pp. 396-414.

3. H. GOLDSTINE, Weakly complete Banach spaces, this Journal, vol. 4(1938), pp. 125-131.

4. H. HAHN, Über lineare Gleichungssysteme in linearen Räumen, Jour. f. d. reine u. ang.
Math., vol. 157(1927), pp. 214-229.

5. T. H. HILDEBRANDT, On bounded linear functional operators, Trans. Amer. Math. Soc.,
vol. 36(1934), pp. 868-875.

6. D. MILMAN, On some criteria for the regularity of spaces of the type (B), Comptes Rendus
de l'Acad. des Sc. de l'URSS, vol. 20(1938), pp. 243-246.


UNIVERSITY OF VIRGINIA. 








DIFFERENTIATION IN BANACH SPACES 
By B. J. PETTIS


Introduction. Consider a figure¹ $R_0$ in a Euclidean n-space. According to a
classical theorem of Lebesgue, if $\mathfrak{X}$ is the space of real numbers, then every ABV
(additive, with bounded variation) function defined to $\mathfrak{X}$ from the figures in $R_0$
is necessarily differentiable a.e.² in $R_0$. But, as Bochner first pointed out [4],
this theorem does not hold for general Banach spaces $\mathfrak{X}$. There exist spaces $\mathfrak{X}$
to which ABV functions may be defined that are differentiable at no point in $R_0$.
Several authors [3, 6, 7, 10, 12, 15] have as a consequence considered the problem
of finding conditions on $\mathfrak{X}$ sufficient that every ABV function defined to $\mathfrak{X}$ be
differentiable a.e. Here, however, we wish to adopt a somewhat different
viewpoint, at least throughout §§1 and 2, the sections fundamental to our
discussion: in the principal theorems of the paper, given in §2, the emphasis has
been placed on the individual function $X_R$ rather than on the space $\mathfrak{X}$ and the
class of all ABV functions having their values in $\mathfrak{X}$. Thus (to put it more
explicitly) the conclusions reached in Theorems 2.5, 2.7, 2.8, and 2.9 state,
with no restriction on $\mathfrak{X}$, that if a fixed ABV function $X_R$ defined to $\mathfrak{X}$ has a
generalized "weak" derivative according to any one of several definitions, then
$X_R$ is differentiable a.e.; that is, $X_R$ has a "strong" derivative. In each of these
four theorems it is shown that a set of necessary conditions, expressed in terms
of linear functionals and apparently quite feeble, is actually of sufficient
strength to insure differentiability a.e.

The possible use of these results is not confined to testing the strong
differentiability of an individual function having its values in an unrestricted (and
perhaps unsatisfactory) space; the theorems can also be applied to the problem
considered in the papers cited above, namely, that of testing whether or not a
given condition which the space $\mathfrak{X}$ is assumed to satisfy is strong enough to
insure the differentiability a.e. of every ABV function defined to $\mathfrak{X}$. The results
concerning differentiation that have been obtained in [3], [7], [10], [12], and [15]
are here derived in §§3-5 from the theorems of the present §2; in each proof the
essential idea is to show that if $\mathfrak{X}$ satisfies the particular condition under
consideration, then $\mathfrak{X}$ is weakly compact in one generalized sense or another.

Following §1, in which the necessary definitions have been grouped, the 
principal theorems will be found in §2. Those dealing with differentiation we 


Received August 30, 1938; presented to the American Mathematical Society, September 
6, 1938. 

¹ Terms used, but undefined, in the present paper will be found in either [18] or [1]
(numbers in brackets refer to the list of references at the end). It is supposed that the
reader is somewhat familiar with these two treatises.

² The phrase "almost everywhere" will be abbreviated throughout to "a.e."



believe to be new. In §3 a theorem of Gelfand's given in [12] and [15] is
obtained in slightly improved form (Theorem 3.1), $R_0$ now being n-dimensional
instead of linear and $\mathfrak{X}$ being supposed weakly compact³ instead of reflexive.⁴
Spaces with bases [10] are involved in §4. In §5 there is an extension of a
theorem of Gelfand [12] concerning differentiation in spaces subjected to certain
separability hypotheses; the extension can be applied to the arbitrary case.
Some comments on the differentiation of abstract integrals occupy §6, and two
concluding remarks, one on real-valued BV convex functions and the other on a
certain assumption made in §2, compose the final section.

Concerning notation, real-valued functions will usually be represented by
Greek letters while those defined to more general B-spaces will be put in italics.
Functions of points will be in lower case ($x_s$, $y_s$, etc.) and functions of figures
in capitals ($X_R$, $Y_R$, etc.). Among the conventions followed in [1], we shall
especially adhere to these two: (i) $\theta$ is the zero element of the particular B-space
under discussion, and (ii) the adjoint spaces of $\mathfrak{X}, \mathfrak{Y}, \ldots$ are denoted by
$\bar{\mathfrak{X}}, \bar{\mathfrak{Y}}, \ldots$, respectively, and the adjoint spaces of $\bar{\mathfrak{X}}, \bar{\mathfrak{Y}}, \ldots$ by $\bar{\bar{\mathfrak{X}}}, \bar{\bar{\mathfrak{Y}}}, \ldots$,
respectively. The function $x_s$ which is identically $\theta$ will be written as $\theta_s$.


1. Preliminary considerations. In Euclidean n-space let $R_0$ be a fixed figure
and R any figure contained in $R_0$. A finite number $R_1, \ldots, R_k$ of figures
lying in R are non-overlapping if $R_i \cdot R_j$ has a vacuous interior whenever $i \ne j$;
if it is also true that $\sum_{i=1}^{k} R_i = R$, then these figures form a partition of R. Let
$X_R$ be a function having the figures R in $R_0$ for its domain and a Banach space
(B-space) $\mathfrak{X}$ as its range. Then $X_R$ is said to be of bounded variation, or BV, if
\[
  \operatorname{l.u.b.}_{\pi} \sum_{i=1}^{k} \|X_{R_i}\| < \infty
\]
as the partition $\pi = [R_1, \ldots, R_k]$ ranges over all partitions of $R_0$. When $X_R$
is BV, the auxiliary real-valued function
\[
  \operatorname{Var}(X_R; R) = \operatorname{l.u.b.}_{\pi} \sum_{i=1}^{k} \|X_{R_i}\|, \qquad \pi \text{ a partition of } R,
\]
is defined over the figures R in $R_0$, and is also BV. A function $X_R$ is termed
additive if $X_{R_1 + R_2} = X_{R_1} + X_{R_2}$ whenever $R_1$ and $R_2$ are non-overlapping, and
ABV when it is both additive and BV. It is clear that an additive function
is always convex; that is, $X_R$ satisfies the inequality $\|X_{R_1 + R_2}\| \le \|X_{R_1}\| +
\|X_{R_2}\|$ for all non-overlapping $R_1$ and $R_2$.
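
As a concrete instance of these definitions, consider the real-valued case in which the figure function is an indefinite integral. The example below is an illustration added here, not taken from the paper; the integrand f is a hypothetical Lebesgue-integrable function on $R_0$.

```latex
% Illustration (assumption: f is Lebesgue integrable on R_0, values in the reals).
% The indefinite integral is additive, and the triangle inequality bounds its variation.
\[
  X_R = \int_R f(s)\, ds ,
  \qquad
  X_{R_1 + R_2} = X_{R_1} + X_{R_2} \quad (R_1, R_2 \text{ non-overlapping}),
\]
\[
  \sum_{i=1}^{k} |X_{R_i}| \;\le\; \sum_{i=1}^{k} \int_{R_i} |f(s)|\, ds \;=\; \int_R |f(s)|\, ds
  \quad\text{for every partition } [R_1, \ldots, R_k] \text{ of } R ,
\]
\[
  \text{so that}\qquad \operatorname{Var}(X_R; R) \;\le\; \int_R |f(s)|\, ds \;<\; \infty .
\]
```

Thus such an $X_R$ is ABV; in the classical real-valued theory the last inequality is in fact an equality.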

If s is a fixed point in $R_0$ and if $\lim X_I / |I|$ exists as I ranges over all the
non-degenerate cubes in $R_0$ that contain s and as $|I|$, the measure⁵ of I, tends

³ For the definition of weak compactness used here see the beginning of §3; this
definition is Banach's ([1], p. 239) with weak completeness added.

⁴ The definition of a reflexive, or regular [13], space can be found in [12], [15], or [17].
Since reflexiveness implies weak compactness ([11] or [15]), the latter property is a
formal weakening of the former. Whether or not it is an actual weakening is, as far as
the author knows, still an open question.

⁵ Since only measurable sets will be involved in any of the proofs, the adjective
"measurable" as applied to sets will usually be omitted.













to 0, then Xz is said to be differentiable at s. When Xz, is differentiable at » 
there exists in %, since ¥ is complete, a unique element z, such that z, = lim 
X;/|I | ; this element is the derivative of Xz at s, and Xx is differentiable to th 
value x, ats. The particular term singular is applied to an ABV function X, 
that a.e. in Rp is differentiable to the zero element @ of X; i.e., Xx is singular 
if it is ABV and if a.e. in Rp it is true that lim || X,|,/|J| = 0 when /is 
subject to the conditions above. In proofs involving either additive functions 
or convex functions we shall often find the following elementary theorem useful; 


If $X_R$ is convex and BV, then

(I) the real-valued function $Q_R = \operatorname{Var}(X_R; R)$ is ABV;

(II) the difference quotients of $X_R$ are bounded at almost every point of $R_0$;

(III) if $Q_R$ is singular, $X_R$ is differentiable to $\theta$ a.e.


Conclusion I is a direct consequence of $X_R$ being convex and BV; the proof is
left to the reader. Conclusion III follows immediately from the inequality
$0 \le \|X_R\| \le Q_R$, and II results⁶ from combining this inequality with the
fact that the real-valued function $Q_R$, being ABV, must have its difference
quotients bounded a.e.
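The step from the inequality to Conclusion III can be written out as follows. This is only a sketch in the notation of this section, where $I$ denotes a non-degenerate cube in $R_0$ containing the point $s$.

```latex
% Sketch: Conclusion III from 0 <= ||X_R|| <= Q_R.
% Divide the inequality, taken at R = I, by the measure |I| > 0:
\[
  0 \;\le\; \frac{\|X_I\|}{|I|} \;\le\; \frac{Q_I}{|I|} .
\]
% If Q_R is singular, the right-hand side tends to 0 at almost every s
% as |I| -> 0 through cubes containing s, and therefore
\[
  \lim_{|I| \to 0} \frac{\|X_I\|}{|I|} \;=\; 0
  \qquad \text{a.e. in } R_0 ,
\]
% which says exactly that X_R is differentiable to \theta a.e.
```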

We now consider various definitions of generalized or weak derivatives. Let
$X_R$ be defined to $\mathfrak{X}$ from the figures in $R_0$ and let $x_s$ be a function defined a.e.
in $R_0$ and having its range in $\mathfrak{X}$. If $\mathfrak{Z} = [\varphi]$ is a set in $\overline{\mathfrak{X}}$, the adjoint of $\mathfrak{X}$, then
$X_R$ is said to be $\mathfrak{Z}$-differentiable to $x_s$ if there exists a measurable set $S$ in $R_0$
such that (i) $|S| = |R_0|$ and (ii) $s \in S$ implies that for every $\varphi \in \mathfrak{Z}$ the real-valued
figure function $\varphi(X_R)$ is differentiable at $s$ to the value $\varphi(x_s)$; under these
circumstances $x_s$ is said to be a $\mathfrak{Z}$-derivative of $X_R$. If $X_R$ has at least one
$\mathfrak{Z}$-derivative $x_s$, and if any other $\mathfrak{Z}$-derivative $y_s$ of $X_R$ is necessarily equivalent
to $x_s$, that is, the equality $x_s = y_s$ must hold a.e. in $R_0$, then $X_R$ will be said
to have a unique $\mathfrak{Z}$-derivative. We note that if the set $\mathfrak{Z}$ of linear functionals
forms a total set over $\mathfrak{X}$, and if $X_R$ has a $\mathfrak{Z}$-derivative, then this derivative is
unique. In particular, if $X_R$ has an $\overline{\mathfrak{X}}$-derivative $x_s$, then $x_s$ is unique; $X_R$ is
then said to be weakly differentiable a.e. and $x_s$ is the weak derivative of $X_R$.

More generally, we say that $X_R$ is $\mathfrak{Z}$-pseudo-differentiable to $x_s$ (and $x_s$ is a
$\mathfrak{Z}$-pseudo-derivative of $X_R$) if [16] for each $\varphi \in \mathfrak{Z}$ the function $\varphi(X_R)$ is differentiable
a.e. to $\varphi(x_s)$. If $X_R$ has at least one $\mathfrak{Z}$-pseudo-derivative, and if any two such
derivatives are necessarily equivalent, then $X_R$ is said to have a unique $\mathfrak{Z}$-pseudo-derivative.
It is evident that any $\mathfrak{Z}$-derivative is a $\mathfrak{Z}$-pseudo-derivative, and
that for denumerable $\mathfrak{Z}$ the converse is true. Thus if $\mathfrak{Z}$ is both denumerable
and total and $x_s$ is a $\mathfrak{Z}$-pseudo-derivative of $X_R$, then $x_s$ is a $\mathfrak{Z}$-derivative of $X_R$
and this derivative is unique.
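The converse for a denumerable set of functionals rests on the fact that a countable union of null sets is null. The following sketch makes this explicit; the names $\varphi_j$ for the functionals and $N_j$ for the exceptional sets are introduced here for illustration only.

```latex
% Assumed notation (not in the original): the denumerable set of functionals
% is written {\varphi_1, \varphi_2, ...}, and N_j is a measurable null set
% off which \varphi_j(X_R) is differentiable to \varphi_j(x_s).
\[
  S \;=\; R_0 \setminus \bigcup_{j=1}^{\infty} N_j ,
  \qquad
  |R_0 \setminus S| \;\le\; \sum_{j=1}^{\infty} |N_j| \;=\; 0 .
\]
% Hence |S| = |R_0|, and at each s in S every \varphi_j(X_R) is differentiable
% to \varphi_j(x_s) simultaneously: the pseudo-derivative is a derivative
% with respect to this set of functionals.
```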

By the span of a set $G$ in $\mathfrak{X}$ is meant the smallest closed linear subspace $\mathfrak{Y}$ in $\mathfrak{X}$
that contains $G$. If $\pi_n = [R_{n,1}, \ldots, R_{n,k_n}]$ ($n = 1, 2, \ldots$) is a sequence of
partitions of $R_0$ and $X_R$ is a fixed ABV function, then the $X_R$-span of the partitions
$\{\pi_n\}$ is the span of the denumerable set $\{X_{R_{n,i}}\}$. The span of $x_s$ (or of
$X_R$) is the span of the set of functional values of $x_s$ (or of $X_R$); if this span is
separable, then $x_s$ (or $X_R$) is said to be separably-valued. A function $x_s$ that is
equivalent to a separably-valued function will be referred to as essentially
separably-valued.

6 A proof that II holds for BV functions whether convex or not is contained in [7],
p. 409.

DIFFERENTIATION IN BANACH SPACES 257

The two final definitions, which are fundamental for our purposes, are these: 


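DEFINITION 1.1. If $\{\gamma_n\}$ is a sequence in $\overline{\mathfrak{X}}$ and $\mathfrak{Y}$ is a set in $\mathfrak{X}$, then $\{\gamma_n\}$ is
said to have property $N(\mathfrak{Y})$ if $\|\gamma_n\| \le 1$ ($n = 1, 2, \ldots$) and
$\|y\| = \limsup_{n} |\gamma_n(y)|$ for every $y \in \mathfrak{Y}$.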


DEFINITION 1.2. Given $X_R$, an $X_R$-maximal sequence is a sequence
$\pi_n = [R_{n,1}, \ldots, R_{n,k_n}]$ ($n = 1, 2, \ldots$) of partitions of $R_0$ such that
$\lim_{n \to \infty} \sum_{i=1}^{k_n} \|X_{R_{n,i}}\| = \operatorname{Var}(X_R; R_0)$.


Concerning Definition 1.2 it is obvious that $X_R$-maximal sequences always
exist for any $X_R$. Likewise, if a set $\mathfrak{Y}$ is contained in a separable closed linear
subspace of $\mathfrak{X}$, i.e., if the span $\mathfrak{Z}$ of $\mathfrak{Y}$ is separable, then there exists at least
one sequence in $\overline{\mathfrak{X}}$ that has property $N(\mathfrak{Y})$; it is sufficient to take a sequence
$\{\gamma_n\}$ weakly dense (as functionals) in the unit sphere of $\overline{\mathfrak{Z}}$ ([1], p. 124, Theorem 4)
and then extend each $\gamma_n$ to form an element $\bar{\gamma}_n$ of $\overline{\mathfrak{X}}$ while preserving the
norm. It is to be noted that from the last condition in Definition 1.1 it follows
that every sequence $\{\gamma_n\}$ having property $N(\mathfrak{Y})$ forms a total set of functionals
over $\mathfrak{Y}$.
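The closing remark, that a sequence with property $N$ is total over the set in question, is a one-line consequence of Definition 1.1. A sketch, writing $\gamma_n$ for the functionals and $\theta$ for the zero element as elsewhere in the paper:

```latex
% Sketch: property N(\mathfrak{Y}) implies totality over \mathfrak{Y}.
% Suppose y lies in \mathfrak{Y} and every functional of the sequence vanishes at y.
\[
  \gamma_n(y) = 0 \quad (n = 1, 2, \ldots)
  \;\Longrightarrow\;
  \|y\| = \limsup_{n} |\gamma_n(y)| = 0
  \;\Longrightarrow\;
  y = \theta .
\]
```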


2. Necessary conditions that are also sufficient in order that an ABV function
$X_R$ be differentiable a.e. A function $x_s$ is a simple function if it is constant on
each of a finite number of measurable sets whose sum is $R_0$; according to Bochner [5],
$x_s$ is measurable if it is the limit a.e. of a sequence of simple functions.
In proving the fundamental result of the present paper, two preliminary theorems
concerning measurability and integrability of abstract functions will be
used. The first, which gives a condition sufficient that $x_s$ be measurable, is an
extension of a known theorem [12, 16].


THEOREM 2.1. Let $x_s$ have a separable span $\mathfrak{Y}$. If $\gamma_m(x_s)$ is measurable for
every $\gamma_m$ in some sequence $\{\gamma_m\}$ that has property $N(\mathfrak{Y})$, then $x_s$ is measurable.


Since $\{\gamma_m\}$ has property $N(\mathfrak{Y})$, where $\mathfrak{Y}$ is the span of $x_s$, it follows that for
every $s$ we have $\|x_s\| = \limsup_{m} |\gamma_m(x_s)|$, so that $\|x_s\|$ is measurable, being
the lim sup of measurable functions. About each of the points $x_s$ in the span $\mathfrak{Y}$
put an open sphere of radius $n^{-1}$. The space $\mathfrak{Y}$ being metric and separable,
a denumerable number $K^{n,i}$ ($i = 1, 2, \ldots$) of these spheres together cover all
the functional values of $x_s$. If $x^{n,i}$ is the center of $K^{n,i}$, then $x^{n,i} \in \mathfrak{Y}$ and hence $\mathfrak{Y}$
contains the span of $x_s - x^{n,i}$; moreover, $\gamma_m(x_s - x^{n,i}) = \gamma_m(x_s) - \gamma_m(x^{n,i})$ is clearly
measurable for every $i$ and $m$. Hence an argument used above shows that













$\|x_s - x^{n,i}\|$ is measurable for each $i$. This implies that
$G^{n,i} = R_0[\,\|x_s - x^{n,i}\| < n^{-1}\,]$ is a measurable set in $R_0$, and since every
point $x_s$ is in some sphere $K^{n,i}$, it follows that $\sum_{i=1}^{\infty} G^{n,i} = R_0$. Hence if
$H^{n,j} = G^{n,j} - \sum_{i=1}^{j-1} G^{n,i}$, then $H^{n,j}$ ($j = 1, 2, \ldots$) is a sequence of
disjoint measurable sets whose sum is $R_0$. Now define $x^n_s$ to have the constant value
$x^{n,j}$ on the set $H^{n,j}$; this function is clearly measurable and satisfies the condition
$\|x_s - x^n_s\| < n^{-1}$ for all $s$, since $\sum_{j} H^{n,j} = R_0$ and $s \in H^{n,j}$ implies that
$\|x_s - x^{n,j}\| < n^{-1}$. Thus $x_s$ can be uniformly approximated in $R_0$ by measurable
functions and therefore is itself measurable.

The abstract integral defined by Bochner [5] plays an important role in the
theory of B-space differentiation [6, 7, 10, 12, 15], since this integral is ABV and
a.e. differentiable [5] and every derivative (when it exists) of an ABV function
is integrable [7, Theorem 5], that is, integrable under Bochner's definition.
Recalling that a necessary and sufficient condition that $x_s$ be integrable is that
it be measurable and that $\|x_s\|$ be summable in the Lebesgue sense, we shall
find in the next theorem (which is a generalization of Theorem 5 of [15]) a
condition sufficient for the integrability of a pseudo-derivative.


THEOREM 2.2. Suppose $x_s$ has a separable span $\mathfrak{Y}$. If $\mathfrak{Z} = \{\gamma_j\}$ has property
$N(\mathfrak{Y})$, and if $x_s$ is a $\mathfrak{Z}$-pseudo-derivative of some function $X_R$, then $x_s$ is measurable.
If in addition $X_R$ is BV, then $x_s$ is integrable.


For each $j$ the function $\gamma_j(x_s)$ is a.e. a derivative and is therefore measurable.
Since $\{\gamma_j\}$ has property $N(\mathfrak{Y})$, $x_s$ must be measurable by Theorem 2.1. Let $I_0$
be a cube containing $R_0$, and define $X'_{R'} = X_{R' \cdot R_0}$ for each figure $R' \subset I_0$. Then
$X'_{R'}$ is BV if $X_R$ is BV; and when $x_s$ is extended to $I_0$ by setting $x'_s = x_s$ in $R_0$
and $x'_s = \theta$ elsewhere, then $X'_{R'}$ is $\mathfrak{Z}$-pseudo-differentiable to $x'_s$. Now let
$\pi_m = [I_{m,1}, \ldots, I_{m,k_m}]$ ($m = 1, 2, \ldots$) be a sequence of partitions of $I_0$ into
non-degenerate subcubes $I_{m,i}$ such that $\lim_{m} (\text{norm } \pi_m) = 0$. If $s$ is in the interior
of $I_{m,i}$, define $x^m_s$ as $X'_{I_{m,i}} / |I_{m,i}|$; otherwise let $x^m_s = \theta$. For each $m$ the function
$x^m_s$ is integrable, and since $X'_{R'}$ is $\{\gamma_j\}$-pseudo-differentiable to $x'_s$, we have
a.e. in $I_0$

(1) $\|x'_s\| = \limsup_{n} |\gamma_n(x'_s)| = \limsup_{n} \bigl( \lim_{m} |\gamma_n(x^m_s)| \bigr) \le \limsup_{n} \bigl( \liminf_{m} \|\gamma_n\| \cdot \|x^m_s\| \bigr) \le \liminf_{m} \|x^m_s\|$.

If $X'_{R'}$ is BV, then

(2) $\int_{I_0} \|x^m_s\| \, ds \le \operatorname{Var}(X'_{R'}; I_0) = \operatorname{Var}(X_R; R_0) < \infty$.

From (1), (2), and Fatou's lemma it follows that $\|x'_s\|$ is majorized over $I_0$
by a summable function, $\liminf_{m} \|x^m_s\|$. But $x'_s$ inherits measurability from $x_s$,
so that $\|x'_s\|$ is measurable. Thus measurable $x'_s$ has $\|x'_s\|$ summable, so that
$x'_s$ is integrable. The integrability of $x_s$ is now obvious.
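The way Fatou's lemma enters the end of this proof can be set out as a chain of inequalities. The sketch below uses primes for the extensions to the cube $I_0$ and $x^m_s$ for the step functions built from the $m$-th partition; this labelling is a reading of the proof and should be treated as an assumption.

```latex
% Sketch of the summability argument in the proof of Theorem 2.2.
% Step 1: x^m_s is constant on the interior of each subcube I_{m,i}, so
\[
  \int_{I_0} \|x^m_s\| \, ds
  \;=\; \sum_{i=1}^{k_m} \frac{\|X'_{I_{m,i}}\|}{|I_{m,i}|} \, |I_{m,i}|
  \;=\; \sum_{i=1}^{k_m} \|X'_{I_{m,i}}\|
  \;\le\; \operatorname{Var}(X'_{R'}; I_0) .
\]
% Step 2: combine the pointwise bound (1) with Fatou's lemma and (2):
\[
  \int_{I_0} \|x'_s\| \, ds
  \;\le\; \int_{I_0} \liminf_{m} \|x^m_s\| \, ds
  \;\le\; \liminf_{m} \int_{I_0} \|x^m_s\| \, ds
  \;\le\; \operatorname{Var}(X_R; R_0) \;<\; \infty .
\]
```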










The proof of the preceding theorem is essentially a vindication of the following 
more general statement. 


THEOREM 2.21. Let $x_s$ have a separable span $\mathfrak{Y}$. Then $x_s$ is measurable if
there exist a sequence $\{\gamma_j\}$ having property $N(\mathfrak{Y})$ and a sequence of functions
$\{x^m_s\}$ such that

(1) $\gamma_j(x^m_s)$ is measurable for $j, m = 1, 2, \ldots$, and

(2) $\gamma_j(x_s) = \lim_{m} \gamma_j(x^m_s)$ a.e. for each $j$.

If in addition

(3) the indefinite integrals $\int_R \|x^m_s\| \, ds$ exist finitely or infinitely and
$\liminf_{m} \int_{R_0} \|x^m_s\| \, ds < \infty$,

then $x_s$ is integrable.


We are now ready to prove the theorem from which Theorem 2.5 will be 
easily derived. This is 

THEOREM 2.3. Suppose that $X_R$ is an ABV function satisfying the following
condition:

(G) Among the $X_R$-maximal sequences there is at least one, $\{\pi_n\}$, such that if $\mathfrak{W}$
is the separable $X_R$-span of $\{\pi_n\}$, then among the sequences having property $N(\mathfrak{W})$
there is at least one, $\{\delta_j\}$, such that $X_R$ is $\{\delta_j\}$-pseudo-differentiable to the function
$\theta_s$.

Then the real-valued ABV function $Q_R = \operatorname{Var}(X_R; R)$ is singular.

Let $\pi_n = [R_{n,1}, \ldots, R_{n,k_n}]$ ($n = 1, 2, \ldots$). Since $\{\delta_j\}$ has property $N(\mathfrak{W})$,
where $\mathfrak{W}$ is the $X_R$-span of $\{\pi_n\}$, clearly for each element $X_{R_{n,i}}$ of $\mathfrak{W}$ there is a
member $\delta_{n,i}$ of $\{\delta_j\}$ such that

(1) $\|X_{R_{n,i}}\| - \dfrac{1}{2^n k_n} \le |\delta_{n,i}(X_{R_{n,i}})|$.

Let $A^n_R = \max \, [\, |\delta_{m,i}(X_R)| ;\ 1 \le m \le n,\ 1 \le i \le k_m \,]$; then

(2) $0 \le A^n_R \le A^{n+1}_R \le \|X_R\|, \qquad R \subset R_0$,

the last inequality on the right arising from the fact that $\|\delta_{m,i}\| \le 1$ for all
$i$ and $m$. From (2), each $A^n_R$ is BV; moreover, $A^n_R$ is convex, being the maximum
of a finite number of convex functions $|\delta_{m,i}(X_R)|$. Since each $A^n_R$ is BV and
convex, the function $Q^n_R = \operatorname{Var}(A^n_R; R)$ is ABV; in addition, (2) implies that

(3) $0 \le Q^n_R \le Q^{n+1}_R \le Q_R, \qquad R \subset R_0$,

so that the limit function $Q'_R = \lim_{n} Q^n_R$ exists, is ABV, and satisfies the inequality

(4) $0 \le Q'_R \le Q_R, \qquad R \subset R_0$.

From (1) on the other hand we can conclude that for any positive integer $n$

$Q^n_{R_0} \ge \sum_{i=1}^{k_n} A^n_{R_{n,i}} \ge \sum_{i=1}^{k_n} |\delta_{n,i}(X_{R_{n,i}})| \ge \sum_{i=1}^{k_n} \|X_{R_{n,i}}\| - 2^{-n}$;













if we take the limit as $n \to \infty$, this becomes, since $\{\pi_n\}$ is an $X_R$-maximal
sequence,

(5) $Q'_{R_0} = \lim_{n} Q^n_{R_0} \ge Q_{R_0}$.

For an arbitrary $R \subset R_0$ it now follows from (5) and (4) that

$Q'_R = Q'_{R_0} - Q'_{R_0 - R} \ge Q_{R_0} - Q_{R_0 - R} = Q_R$,

and this combined with (4) results in the identity $Q'_R = Q_R$ for all $R \subset R_0$.

Thus $\{Q^n_R\}$ is a monotone increasing sequence of ABV functions converging
to $Q_R$. If it is shown that each $Q^n_R$ is differentiable a.e. to 0, that is, each $Q^n_R$
is singular, then $Q_R$ will necessarily be singular ([18], p. 94, Theorem 12.1, 3°)
and the theorem will be established. But for each $i$ and $m$ the ABV function
$\delta_{m,i}(X_R)$ is differentiable a.e. to 0, since by assumption $X_R$ is $\{\delta_j\}$-pseudo-differentiable
to $\theta_s = \theta$. The set of singular functions being closed under the
operations of addition and of taking the total variation ([18], p. 94, Theorem
12.1), it follows that the function

$\Lambda^n_R = \sum_{m=1}^{n} \sum_{i=1}^{k_m} \operatorname{Var}(\delta_{m,i}(X_R); R)$

must be singular for each $n$. Since

$\Lambda^n_R \ge \operatorname{Var}\bigl( \sum_{m=1}^{n} \sum_{i=1}^{k_m} |\delta_{m,i}(X_R)| ; R \bigr) \ge \operatorname{Var}(A^n_R; R) = Q^n_R \ge 0$,

it is now seen that each $Q^n_R$ is also singular. This completes the demonstration.
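The bookkeeping behind the term $2^{-n}$ and the passage to the limit can be made explicit. This is a sketch under the reading that inequality (1) of the proof carries the error term $1/(2^n k_n)$ for each of the $k_n$ members of the $n$-th partition, and that $Q^n_{R_0}$ denotes the variation over $R_0$ of the $n$-th maximum function.

```latex
% Sketch: summing the k_n copies of inequality (1) over the n-th partition.
\[
  \sum_{i=1}^{k_n} |\delta_{n,i}(X_{R_{n,i}})|
  \;\ge\; \sum_{i=1}^{k_n} \Bigl( \|X_{R_{n,i}}\| - \frac{1}{2^n k_n} \Bigr)
  \;=\; \sum_{i=1}^{k_n} \|X_{R_{n,i}}\| \;-\; 2^{-n} .
\]
% Letting n -> infinity and using that {\pi_n} is X_R-maximal:
\[
  \lim_{n} Q^n_{R_0}
  \;\ge\; \lim_{n} \Bigl( \sum_{i=1}^{k_n} \|X_{R_{n,i}}\| - 2^{-n} \Bigr)
  \;=\; \operatorname{Var}(X_R; R_0)
  \;=\; Q_{R_0} .
\]
```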


THEOREM 2.4. If $X_R$ is ABV, then the following statements are equivalent:

(2.41) $X_R$ satisfies condition (G) of Theorem 2.3;

(2.42) $X_R$ is differentiable to $\theta$ a.e.;

(2.43) the function $Q_R = \operatorname{Var}(X_R; R)$ is singular;

(2.44) for each $\epsilon > 0$ there exists an open set $E_\epsilon$ such that $|E_\epsilon| < \epsilon$ and
$\operatorname{Var}(X_R; E_\epsilon) = \operatorname{Var}(X_R; R_0)$.

Since $Q_R$ is ABV and non-negative, it is its own variation; the equivalence of
(2.43) and (2.44) results from combining this fact with a well-known theorem
in real-function theory ([18], p. 121, Theorem 7.8). That (2.42) follows from
(2.43) is stated in III of §1; and (2.41) implies (2.43) by Theorem 2.3 above.
Thus the only implication that remains to be established is that (2.42) implies
(2.41).

If $X_R$ is ABV, let $\{\pi_n\}$ be an $X_R$-maximal sequence and let $\mathfrak{W}$ be the separable
$X_R$-span of $\{\pi_n\}$. According to a remark made in §1, there exists in $\overline{\mathfrak{X}}$ at least
one sequence $\{\delta_j\}$ having property $N(\mathfrak{W})$. If $X_R$ is differentiable to $\theta$ a.e.,
then clearly $X_R$ is $\overline{\mathfrak{X}}$-differentiable to $\theta_s = \theta$, and hence $X_R$ is certainly
$\{\delta_j\}$-pseudo-differentiable to $\theta_s$. Thus $X_R$ has property (G), and (2.42) implies
(2.41).
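For orientation, the implications used in this proof can be collected in a single display; the bracketed attributions repeat the sources named in the proof and add nothing new.

```latex
% Summary of the implications in the proof of Theorem 2.4.
\[
\begin{array}{lcl}
  (2.41) \Rightarrow (2.43) & & \text{[Theorem 2.3]} \\
  (2.43) \Rightarrow (2.42) & & \text{[III of \S 1]} \\
  (2.42) \Rightarrow (2.41) & & \text{[existence of a sequence with property } N\text{, \S 1]} \\
  (2.43) \Leftrightarrow (2.44) & & \text{[[18], p. 121, Theorem 7.8]}
\end{array}
\]
% The first three lines form a cycle, so (2.41), (2.42), (2.43) are equivalent;
% the fourth line attaches (2.44) to the cycle.
```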













The equivalence of (2.42) and (2.44) has been previously established in [10]. 
The next theorem, our principal result, is obtained almost immediately from 
Theorems 2.2 and 2.3. 


THEOREM 2.5. Let $X_R$ be ABV. Suppose that $x_s$ is a function having a separable
span $\mathfrak{Y}$ and that $X_R$ is $\{\gamma_j\}$-pseudo-differentiable to $x_s$, where $\{\gamma_j\}$ is one
of the sequences in $\overline{\mathfrak{X}}$ having property $N(\mathfrak{Y})$. Then the integral $Y_R = \int_R x_s \, ds$
exists. In addition suppose, $Z_R$ being the ABV function $X_R - Y_R$, that $X_R$ is
$\{\delta_j\}$-pseudo-differentiable to $x_s$ for some sequence $\{\delta_j\}$ having property $N(\mathfrak{W})$,
where $\mathfrak{W}$ is the separable $Z_R$-span of some $Z_R$-maximal sequence $\{\pi_n\}$. Then
$X_R$ is differentiable a.e. to $x_s$.


The first assertion is merely a repeated statement of the previously established
Theorem 2.2. The integral $Y_R$ being differentiable a.e. to its integrand
$x_s$, it follows that $Z_R$ is the difference of two functions each of which is
$\{\delta_j\}$-pseudo-differentiable to $x_s$. Since this implies that $Z_R$ is $\{\delta_j\}$-pseudo-differentiable
to $\theta_s$, it is readily seen that $Z_R$ satisfies condition (G) of Theorem 2.3
and therefore is differentiable to $\theta$ a.e. This proves the theorem, since
$X_R = Y_R + Z_R$, where $Y_R$ is differentiable a.e. to $x_s$.
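The last step of the proof is a statement about difference quotients of the decomposition into the integral part and the singular part. A sketch, with $I$ a non-degenerate cube containing $s$, $Y_R$ the indefinite integral of $x_s$, and $Z_R = X_R - Y_R$:

```latex
% Sketch: difference quotients of X_R = Y_R + Z_R.
\[
  \frac{X_I}{|I|} \;=\; \frac{Y_I}{|I|} \;+\; \frac{Z_I}{|I|}
  \;\longrightarrow\; x_s + \theta \;=\; x_s
  \qquad (|I| \to 0,\ s \in I)
\]
% for almost every s in R_0: the first quotient converges a.e. to the integrand
% x_s (Bochner integral), the second converges a.e. to \theta because Z_R
% satisfies condition (G).
```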

As an immediate inference of Theorem 2.5 we have 


THEOREM 2.6. If $X_R$ is ABV and $x_s$ is separably-valued, and if there is a
sequence $\{\gamma_j\}$ having property $N(\mathfrak{X})$ and such that $X_R$ is $\{\gamma_j\}$-pseudo-differentiable
to $x_s$, then $x_s$ is integrable and $X_R$ is differentiable a.e. to $x_s$.


The next theorem is 


THEOREM 2.7. If $X_R$ is ABV and if there exists a separably-valued function
$x_s$ such that $X_R$ is $\{\gamma_j\}$-pseudo-differentiable to $x_s$ for every sequence $\{\gamma_j\}$ in $\overline{\mathfrak{X}}$,
then $x_s$ is integrable and $X_R$ is differentiable a.e. to $x_s$.


Let $\mathfrak{Y}$ be the separable span of $x_s$. As we have noted in §1, the separability
of $\mathfrak{Y}$ implies the existence of a sequence $\{\gamma_j\}$ in $\overline{\mathfrak{X}}$ that has property $N(\mathfrak{Y})$; since
$x_s$ is a $\{\gamma_j\}$-pseudo-derivative of $X_R$, $x_s$ must be integrable. Now let $\{\pi_n\}$ be
any $Z_R$-maximal sequence, where $Z_R = X_R - \int_R x_s \, ds$; the $Z_R$-span $\mathfrak{W}$ of $\{\pi_n\}$
is separable and hence there exists a sequence $\{\delta_j\}$ having property $N(\mathfrak{W})$.
From the assumption in the theorem $X_R$ must be $\{\delta_j\}$-pseudo-differentiable to
$x_s$. All the hypotheses of Theorem 2.5 are now fulfilled, and the present conclusion
follows.

Theorem 2.7 can be rephrased as 


THEOREM 2.8. If $X_R$ is ABV and $x_s$ is separably-valued, then if $X_R$ is $\overline{\mathfrak{X}}$-pseudo-differentiable
to $x_s$, it follows that $x_s$ is integrable and is the derivative a.e. of $X_R$.


This in turn permits the following improvement to be made on a result obtained
previously ([15], p. 427).













THEOREM 2.9. If $X_R$ is ABV and $\overline{\mathfrak{X}}$-differentiable to $x_s$, that is, $X_R$ is weakly
differentiable to $x_s$ a.e., then $x_s$ is integrable and $X_R$ is differentiable a.e. to $x_s$.⁷


Since $x_s$ is an $\overline{\mathfrak{X}}$-derivative, it must be equivalent to a separably-valued function
$y_s$ ([16], Theorem 1.2). The function $y_s$, being separably-valued and
obviously an $\overline{\mathfrak{X}}$-derivative of $X_R$, is a.e. the derivative of $X_R$ by Theorem 2.8.
Our conclusion now follows from the equivalence of $y_s$ and $x_s$.

In concluding this section we remark that Theorem 2.9 immediately implies
a result of Clarkson's [7], to the effect that if $X_R$ is ABV and differentiable a.e.,
then the derivative function is integrable. The derivative is essentially separably-valued,
and hence $X_R$ is differentiable a.e. to a separably-valued integrable
function $x_s$. It can easily be seen that the two functions $X_R$ and $x_s$ possess all
the properties demanded in the hypotheses of Theorems 2.5, 2.7, 2.8, and 2.9.
The conditions shown to be sufficient in those theorems are therefore also
necessary in order that $X_R$ be differentiable a.e.


3. Weakly compact spaces. In [12] and [15] it was shown that if $R_0$ is a
linear figure and $\mathfrak{X}$ is a reflexive B-space, then the ABV function $X_R$ is differentiable
a.e. In the next theorem we remove from this statement the restriction
that $R_0$ be linear and at the same time formally weaken the requirement of
reflexiveness for $\mathfrak{X}$ by substituting weak compactness; that is, we suppose that
in every bounded sequence in $\mathfrak{X}$ there is a subsequence converging weakly to an
element of $\mathfrak{X}$. Accordingly we state

THEOREM 3.1. If $X_R$ is ABV and $\mathfrak{X}$ is weakly compact, then $X_R$ is differentiable
a.e. in $R_0$.

Let $I_0$ be a cube containing $R_0$ and $\{\pi_n\}$ a sequence of partitions of $I_0$ into
non-degenerate subcubes, with $\pi_n = [I_{n,1}, \ldots, I_{n,k_n}]$ and $\lim_{n} (\text{norm } \pi_n) = 0$.
If $s$ lies in the interior of $I_{n,i}$ and $R_0 \supset I_{n,i}$, define $x^n_s$ to be $X_{I_{n,i}} / |I_{n,i}|$; otherwise
let $x^n_s = \theta$. From II of §1 and the weak compactness of $\mathfrak{X}$ it follows that
for almost every $s \in R_0$ the sequence $\{x^n_s\}$ contains a subsequence converging
weakly to an element $x_s$ of $\mathfrak{X}$. The function $x_s$ is then separably-valued since
its span must ([1], p. 134) lie in the span of the sequence $\{x^n_s\}$ of simple functions.
Moreover, from the definitions of $\{\pi_n\}$, $\{x^n_s\}$, and $x_s$ it is seen that for each
$\gamma \in \overline{\mathfrak{X}}$ the derivative of $\gamma(X_R)$ must coincide a.e. with $\gamma(x_s)$. Theorem 3.1 is
now an immediate implication of Theorem 2.8.

Every reflexive space being necessarily weakly compact, we have 

THEOREM 3.2. Any ABV function defined to a reflexive space is differentiable
a.e.

Since $\mathfrak{X}$ is reflexive if it is isomorphic to a uniformly convex space [14, 17],
the following result of Clarkson's [7] is in turn a corollary of Theorem 3.2.

7 This theorem furnishes another example of a weak property implying the corresponding
strong property. For further examples see [9] and various general mean ergodic
theorems that have lately appeared.
8 A result obtained by Gantmakher and Smulian [11].










THEOREM 3.3. If $X_R$ is ABV and has its values in a space $\mathfrak{X}$ that is isomorphic
to a uniformly convex space, then $X_R$ is differentiable a.e.⁹


4. $\mathfrak{X}$ has a base. Suppose that $\mathfrak{X}$ has a base $\{x_i\}$, so that every $x \in \mathfrak{X}$ has a
representation $x = \sum_{i=1}^{\infty} \varphi_i(x) x_i$, where $\{\varphi_i\}$ is a sequence in $\overline{\mathfrak{X}}$ that is independent
of $x$. Let $U$ be the point-set product of the unit sphere of $\overline{\mathfrak{X}}$ and the separable
span $\mathfrak{Z}$ of $\{\varphi_i\}$. Each ABV function $X_R$ defined to $\mathfrak{X}$ can be written as
$X_R = \sum_{i=1}^{\infty} \varphi_i(X_R) x_i$, where for each $i$ the function $\varphi_i(X_R)$ has a derivative a.e.; let
$\xi^i_s$ be this derivative where it exists. If we consider a denumerable set dense in $U$
and consisting of finite linear combinations of the $\varphi_i$'s, the next theorem can be
inferred directly from Theorem 2.6.


THEOREM 4.1. Suppose the base $\{x_i\}$ has the property that

$\|x\| = \sup_{\gamma \in U} |\gamma(x)|, \qquad x \in \mathfrak{X}$.

Then if $X_R$ is ABV and if $x_s = \sum_{i=1}^{\infty} \xi^i_s x_i$ exists a.e. in $R_0$, $X_R$ must be differentiable
a.e. to $x_s$.


From Theorem 4.1 the following result, obtained by Dunford and Morse [10],
can be drawn.


THEOREM 4.2. If $\mathfrak{X}$ has a base $\{x_i\}$ with the property

(A) $\limsup_{n} \bigl\| \sum_{i=1}^{n} a_i x_i \bigr\| < \infty$ implies that $\sum_{i=1}^{\infty} a_i x_i$ is convergent,

then any ABV function $X_R$ defined to $\mathfrak{X}$ is differentiable a.e.


In the proof given in [10] two preliminary facts are established:
(i) there is no loss of generality in assuming that $\{x_i\}$ has the property

(B) the inequality $\bigl\| \sum_{i=1}^{n} a_i x_i \bigr\| \le \bigl\| \sum_{i=1}^{n+1} a_i x_i \bigr\|$ holds for any constants $a_1, \ldots, a_{n+1}$;

(ii) properties (A) and (B) imply that $x_s = \sum_{i=1}^{\infty} \xi^i_s x_i$ exists a.e.

In view of (ii) and Theorem 4.1, the present theorem will be justified if the
inequality $\|x\| \le \sup_{\gamma \in U} |\gamma(x)|$ is shown to hold for every $x$ (the reverse
inequality is obvious). To do this fix $x$, take $\gamma \in \overline{\mathfrak{X}}$ such that $\|\gamma\| = 1$ and
$\|x\| = |\gamma(x)|$, and consider the element $\gamma_n = \sum_{i=1}^{n} \gamma(x_i) \varphi_i$ in $\mathfrak{Z}$. Property (B)
shows that $\|\gamma_n\| \le 1$, so that $\gamma_n$ must also be in $U$. The desired inequality
now follows from this and from the fact that $\|x\| = |\gamma(x)| = \lim_{n} |\gamma_n(x)|$.
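Property (A) is easiest to appreciate on two standard sequence spaces. The example below is an illustration added here, not taken from the paper: $e_i$ denotes the $i$-th unit vector, and the spaces $\ell^1$ and $c_0$ carry their usual norms.

```latex
% Illustration (not from the paper): property (A) for the unit-vector base.
% In \ell^1 the partial sums have norm equal to the partial sums of |a_i|:
\[
  \limsup_{n} \Bigl\| \sum_{i=1}^{n} a_i e_i \Bigr\|_{\ell^1}
  = \sum_{i=1}^{\infty} |a_i| < \infty
  \;\Longrightarrow\;
  \Bigl\| \sum_{i=m+1}^{n} a_i e_i \Bigr\|_{\ell^1}
  = \sum_{i=m+1}^{n} |a_i| \to 0 ,
\]
% so the series converges in norm and (A) holds.
% In c_0 take a_i = 1 for every i:
\[
  \Bigl\| \sum_{i=1}^{n} e_i \Bigr\|_{c_0} = 1 \ \text{ for all } n ,
  \qquad
  \Bigl\| \sum_{i=1}^{n+1} e_i - \sum_{i=1}^{n} e_i \Bigr\|_{c_0} = 1 ,
\]
% so the partial sums are bounded but not Cauchy, and (A) fails.
```

Both bases satisfy the monotonicity property (B), so the contrast isolates the role of (A).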


9 For the case of a linear figure $R_0$ it is noted in [14] that this result of Clarkson's is a
consequence of the theorem of Gelfand's cited at the beginning of this section.












5. Separability assumptions and their application to the case of a general $\mathfrak{X}$.
Suppose that $\mathfrak{X} = [x]$ is the adjoint of a separable space $\mathfrak{Y} = [y]$, and let $\{y_j\}$
be a sequence dense in the unit sphere of $\mathfrak{Y}$. For each $y \in \mathfrak{Y}$ define the functional
$\gamma_y(x) = x(y)$ over $\mathfrak{X}$; then $\mathfrak{Y}' = [\gamma_y]$ is a subset of $\overline{\mathfrak{X}}$ and forms a total
set of linear functionals over $\mathfrak{X}$. If $X_R$ and $x_s$ are defined to $\mathfrak{X}$, we shall say that
$X_R$ is $\mathfrak{Y}$-(pseudo-)differentiable to $x_s$ if $X_R$ is $\mathfrak{Y}'$-(pseudo-)differentiable to $x_s$
in the sense of our previous definitions. Writing $\gamma_j$ for the element $\gamma_{y_j}$ of $\mathfrak{Y}'$
and noting that $\{\gamma_j\}$ has property $N(\mathfrak{X})$ and is therefore total over $\mathfrak{X}$, the following
amplification of Theorem 4 of [12] can now be stated.

THEOREM 5.1. Let $X_R$ be defined to such a space $\mathfrak{X}$. If $X_R$ has its difference
quotients bounded at almost every point in $R_0$, and if $\gamma_j(X_R)$ is differentiable a.e.
in $R_0$ for each $j$, then $X_R$ has a unique $\mathfrak{Y}$-derivative $x_s$, and $x_s$ is measurable if it is
essentially separably-valued.

Hence if $X_R$ is ABV and defined to the adjoint of a separable space $\mathfrak{Y}$, then $X_R$
has a unique $\mathfrak{Y}$-derivative¹⁰ $x_s$; moreover, if $x_s$ is essentially separably-valued,
then $x_s$ is integrable and $X_R$ is differentiable a.e. to $x_s$.


From the assumptions made in the theorem there exists a set $S$ in the interior
of $R_0$ having the three properties: (i) $|S| = |R_0|$; (ii) at each point of $S$ the
difference quotients of $X_R$ are bounded; and (iii) at each point of $S$ the function
$\gamma_j(X_R)$ is differentiable for every $j$. Let $\{x_n\}$ be a sequence of difference quotients
of $X_R$ defined by non-degenerate cubes closing down on a fixed point
$s$ in $S$. Then $\limsup_{n} \|x_n\| < \infty$, and $\lim_{n} \gamma_j(x_n) = \lim_{n} x_n(y_j)$ exists for every $j$,
where $\{y_j\}$ is dense in the unit sphere of $\mathfrak{Y}$. This implies that $\{x_n\}$ converges to an
$x_s$ in $\mathfrak{X}$, in the sense that for every $y$ we have

(1) $\gamma_y(x_s) = x_s(y) = \lim_{n} x_n(y) = \lim_{n} \gamma_y(x_n)$,

and in particular

(2) $\gamma_j(x_s) = \lim_{n} \gamma_j(x_n) = \dfrac{d}{ds} \gamma_j(X_R)$

for every $j$ by virtue of (iii). Thus if $x'_s$ is the limit of any other sequence of
difference quotients closing down on the point $s$, from (2) we have
$\gamma_j(x'_s) = \dfrac{d}{ds} \gamma_j(X_R) = \gamma_j(x_s)$ for every $j$, and hence $x'_s = x_s$ since $\{\gamma_j\}$ is total over $\mathfrak{X}$. This
together with (1) implies that for every $y$ the derivative of $\gamma_y(X_R)$ exists at $s$
and has the value $\gamma_y(x_s)$. Since $|S| = |R_0|$, it is now clear that $X_R$ is $\mathfrak{Y}'$-differentiable
to $x_s$; and since $\mathfrak{Y}'$ is total over $\mathfrak{X}$, this $\mathfrak{Y}'$-derivative must be
unique.
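The passage from convergence on the dense sequence to convergence at every point of the separable space is a standard three-term estimate. In the sketch below, $M$ is an assumed bound for the norms of the difference quotients $x_n$ at the fixed point $s$, and $y$ is taken in the unit sphere; the general case follows by homogeneity.

```latex
% Sketch: bounded functionals converging on a dense set converge everywhere.
% Assumptions: ||x_n|| <= M for all n, ||y|| <= 1, y_j from the dense sequence.
\[
  |x_n(y) - x_m(y)|
  \;\le\; |x_n(y - y_j)| + |x_n(y_j) - x_m(y_j)| + |x_m(y_j - y)|
  \;\le\; 2M \, \|y - y_j\| + |x_n(y_j) - x_m(y_j)| .
\]
% Given \epsilon > 0, choose j with ||y - y_j|| < \epsilon / (4M); the last
% term is below \epsilon / 2 for all large n, m because x_n(y_j) converges.
% Hence {x_n(y)} is a Cauchy sequence of real numbers, and
\[
  x_s(y) \;=\; \lim_{n} x_n(y)
\]
% defines a linear functional on \mathfrak{Y} with norm at most M.
```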

10 It is to be noted that "$\mathfrak{Y}$-derivative" can not be replaced here by the stronger term
"$\overline{\mathfrak{X}}$-derivative"; see (B) of §7 and Theorem 2.9. The "weak derivative" ascribed to $X_R$
in Theorem 4 of [12] must be understood as the $\mathfrak{Y}$-derivative. The difference quotients
of $X_R$ converge weakly a.e. as functionals over $\mathfrak{Y}$, but not necessarily do they converge
weakly a.e. as elements of $\mathfrak{X}$.










If $x_s$ is essentially separably-valued, then if we disregard values on a set of
measure zero, it is separably-valued and yet still remains the $\mathfrak{Y}'$-derivative of
$X_R$. Since the sequence $\{\gamma_j\}$ has property $N(\mathfrak{X})$, Theorem 2.2 now yields that
$x_s$ is measurable. Thus the first part of the theorem is justified.

The fact that $X_R$ has a unique $\mathfrak{Y}$-derivative $x_s$ if $X_R$ is ABV follows from the
preceding when it is recalled that an ABV function has its difference quotients
bounded a.e. in $R_0$ and that $\gamma(X_R)$ is a.e. differentiable for each $\gamma \in \overline{\mathfrak{X}}$. The last
statement in the theorem is a direct consequence of Theorem 2.6 since $X_R$ is
$\{\gamma_j\}$-pseudo-differentiable to $x_s$, where $\{\gamma_j\}$ has property $N(\mathfrak{X})$.

Because the values of $x_s$ all lie in $\mathfrak{X}$, an obvious corollary ([12], Theorem 3) is


COROLLARY 5.11. If the space $\mathfrak{X}$ of Theorem 5.1 is separable, then any ABV
function $X_R$ defined to $\mathfrak{X}$ is differentiable a.e. to its $\mathfrak{Y}$-derivative $x_s$.


These results still have application even in the case in which $X_R$ assumes
values in an arbitrary space $\mathfrak{Z}$. In this connection we may point out


THEOREM 5.2. Any BV function $Z_s$ defined from a linear interval to an arbitrary
space $\mathfrak{Z}$ must have a separable span $\mathfrak{W}$.


If $R_0$ is a linear interval $a \le s \le b$ and $Z_s$ is a BV function defined from the
points of $R_0$ to an arbitrary space $\mathfrak{Z}$, then if we utilize a certain parameter
transformation $s = \alpha(t)$ ([10], §4), a function $X_t = Z_{\alpha(t)}$ can be obtained which
satisfies a Lipschitz condition and whose functional values include those of $Z_s$.
Since $X_t$ is continuous over a linear interval, its span is separable, and hence
$Z_s$ has a separable span $\mathfrak{W}$ in $\mathfrak{Z}$.

Thus if $Z_s$ is any BV function defined from a linear interval to an arbitrary
space $\mathfrak{Z}$, there must be in $\overline{\mathfrak{Z}}$ a sequence $\{\gamma_j\}$ having property $N(\mathfrak{W})$. Letting $\mathfrak{Y}$
be the span of the sequence $\{\gamma_j\}$ in $\overline{\mathfrak{Z}}$, $Z_s$ can be considered as defined to the
adjoint $\overline{\mathfrak{Y}}$ of the separable space $\mathfrak{Y}$. As such, it must have a unique $\mathfrak{Y}$-derivative.
If this $\mathfrak{Y}$-derivative is essentially separably-valued in $\overline{\mathfrak{Y}}$, then $Z_s$, considered
as defined to $\overline{\mathfrak{Y}}$, must be differentiable a.e. Since $\mathfrak{Y}$ is the span of a
sequence having property $N(\mathfrak{W})$, it can be shown further that $Z_s$ is then differentiable
a.e. when considered as a function defined to $\mathfrak{Z}$.


6. Remarks on the differentiation of abstract integrals. In this section
we should like to make a comment or two concerning the differentiation of
certain abstract integrals that have been constructed for functions having the
points in $R_0$ for their domain and an arbitrary B-space $\mathfrak{X}$ for their range. Only
the following three distinct definitions of integrability, listed in decreasing order
of generality, will be considered.


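DEFINITION 6.1. $x_s$ is ($\mathfrak{X}$) integrable [9, 12, 16] if $\gamma(x_s)$ is summable Lebesgue
for every $\gamma \in \overline{\mathfrak{X}}$ and if for each measurable set $E \subset R_0$ there exists an element $x_E \in \mathfrak{X}$
such that $\gamma(x_E) = \int_E \gamma(x_s) \, ds$ for all $\gamma \in \overline{\mathfrak{X}}$. The integral $(\mathfrak{X}) \int_E x_s \, ds$ of $x_s$ over $E$
is by definition this element $x_E$.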













DEFINITION 6.2. $x_s$ is (D) integrable [8, 16] if it is measurable and ($\mathfrak{X}$) integrable.
The (D) integral of $x_s$ over $E$ is taken to be $(\mathfrak{X}) \int_E x_s \, ds$.


DEFINITION 6.3. $x_s$ is integrable if it is Bochner integrable.

If $x_s$ is integrable, then it is (D) integrable [8, 16], and if it is (D) integrable,
then it is ($\mathfrak{X}$) integrable [16]; moreover, if $x_s$ is integrable according to any two
of these definitions, then the two integrals of $x_s$ coincide over every measurable
set [3, 8, 16]. All of these integrals are absolutely continuous and completely
additive [3, 5, 8, 16]; in addition, every Bochner integral is BV, differentiable
a.e., and also absolutely additive, that is, $\sum_{n=1}^{\infty} \bigl\| \int_{E_n} x_s \, ds \bigr\| < \infty$ whenever $\{E_n\}$
are disjoint measurable sets. The next theorem, the proof of which involves a
simple application of Theorem 2.8, serves to make more precise the distinctions
between the three integrals.


THEOREM 6.4. If $x_s$ is (D) integrable and $X_E = (D) \int_E x_s \, ds$, then the following
conditions are all equivalent:

(6.5) $X_E$ is absolutely additive;
(6.6) $X_R$ is BV;
(6.7) $\int_{R_0} \|x_s\| \, ds < \infty$;
(6.8) $x_s$ is integrable.

Each of these conditions implies that
(6.9) $X_R$ is differentiable a.e. to $x_s$.


Since $x_s$ is supposed measurable, the real-valued function $\|x_s\|$ must be
measurable, so that the integral in (6.7) always exists, either finitely or infinitely.
It is known [5] that (6.8) implies (6.5), (6.6), and (6.9), and that for a measurable
$x_s$ the conditions (6.7) and (6.8) are equivalent [5]. There remain only the
proofs that (6.5) and (6.6) each implies (6.8).

Suppose that $x_s$ is (D) integrable and that $X_R = (D) \int_R x_s \, ds = (\mathfrak{X}) \int_R x_s \, ds$
is BV. Then $\gamma(X_R) = \int_R \gamma(x_s) \, ds$ holds for every $\gamma$, and $\gamma(X_R)$ is differentiable
a.e. to $\gamma(x_s)$ since it is the Lebesgue integral of $\gamma(x_s)$. The ABV function $X_R$
is then $\overline{\mathfrak{X}}$-pseudo-differentiable to $x_s$. Since $x_s$ is measurable by assumption,
it is essentially separably-valued. From Theorem 2.8 it follows that $x_s$ is
integrable. Thus (6.6) implies (6.8).

If $x_s$ is (D) integrable, then [16] there exists a sequence $\{x^n_s\}$ of (D) integrable
functions such that (i) each $x^n_s$ has only a countable number of functional













values, and (ii) $\|x_s - x^n_s\| < n^{-1}$ holds uniformly a.e. in $R_0$. Assume that $\int_E x_s \, ds$
is absolutely additive; since for each $n$ and each set $E$ we have

$\bigl\| \int_E x^n_s \, ds \bigr\| \le \bigl\| \int_E x_s \, ds \bigr\| + \dfrac{|E|}{n}$,

the integral $\int_E x^n_s \, ds$ must be absolutely additive for each $n$. It now follows
that $x^n_s$ is integrable. According to (i) there exist disjoint measurable sets
$E_{n,i}$ ($i = 1, 2, \ldots$) such that $x^n_s$ has a constant value $x_{n,i}$ over $E_{n,i}$, and
$\sum_{i=1}^{\infty} E_{n,i} = R_0$. Both $x^n_s$ and $\|x^n_s\|$ are measurable, and since $\int_E x^n_s \, ds$ is absolutely
additive, we can write

$\int_{R_0} \|x^n_s\| \, ds = \sum_{i=1}^{\infty} \|x_{n,i}\| \cdot |E_{n,i}| = \sum_{i=1}^{\infty} \bigl\| x_{n,i} \cdot |E_{n,i}| \bigr\| = \sum_{i=1}^{\infty} \bigl\| \int_{E_{n,i}} x^n_s \, ds \bigr\| < \infty$.

Thus the measurable function $x^n_s$ has $\|x^n_s\|$ summable, and $x^n_s$ must be integrable.
Combining this with (ii) we obtain the conclusion that $x_s$ is a.e.
the uniform limit of integrable functions, and therefore $x_s$ is itself integrable.
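The inequality comparing the integrals of $x^n_s$ and $x_s$, and the way absolute additivity is inherited, can be derived in two lines. This is a sketch: it uses the defining property of the ($\mathfrak{X}$) integral from Definition 6.1 and the norm formula $\|x\| = \sup_{\|\gamma\| \le 1} |\gamma(x)|$, and $\{E_k\}$ stands for an arbitrary sequence of disjoint measurable sets in $R_0$.

```latex
% Sketch: comparing the integrals of x^n_s and x_s over a measurable set E.
\[
  \Bigl\| \int_E x^n_s \, ds - \int_E x_s \, ds \Bigr\|
  \;=\; \sup_{\|\gamma\| \le 1} \Bigl| \int_E \gamma(x^n_s - x_s) \, ds \Bigr|
  \;\le\; \int_E \|x^n_s - x_s\| \, ds
  \;\le\; \frac{|E|}{n} .
\]
% Summing over disjoint measurable sets E_1, E_2, ... in R_0:
\[
  \sum_{k=1}^{\infty} \Bigl\| \int_{E_k} x^n_s \, ds \Bigr\|
  \;\le\; \sum_{k=1}^{\infty} \Bigl\| \int_{E_k} x_s \, ds \Bigr\|
          + \frac{1}{n} \sum_{k=1}^{\infty} |E_k|
  \;\le\; \sum_{k=1}^{\infty} \Bigl\| \int_{E_k} x_s \, ds \Bigr\| + \frac{|R_0|}{n}
  \;<\; \infty .
\]
```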

The hypothesis of (D) integrability cannot be replaced by the weaker assumption
of ($\mathfrak{X}$) integrability. Example 1 of [3] gives a non-measurable ($\mathfrak{X}$) integrable
function whose ($\mathfrak{X}$) integral satisfies (6.5), (6.6), and (6.7), but not (6.8) or (6.9).
Another example, the seventh, in [3] shows that (6.9) is not equivalent to any
one of the conditions (6.5)-(6.8) even for (D) integrable functions.


7. Two concluding remarks. (A) In the second part of the proof of Lemma 1
of [10] there is implicitly established this proposition: if $X_R$ is a real-valued BV
convex function¹¹ of figures, then $X_R$ satisfies condition (2.44) if it satisfies condition
(2.42). This leads immediately to the following extension of a theorem which
is well known for ABV real-valued functions.


THEOREM 7.1. For real-valued BV convex functions of figures the conditions
(2.42), (2.43), and (2.44) are all equivalent.


From the above remark (2.42) implies (2.44). And since the ABV function
$Q_R = \operatorname{Var}(X_R; R)$ is non-negative and hence identical with its own variation
function, it follows that (2.44) implies (2.43) since Theorem 7.1 is known to
hold for ABV functions. The remaining implication that (2.42) results from
(2.43) was proved in the introduction.

If we utilize Theorem 7.1, the argument that established Theorem 2.3 may
be repeated to justify the following generalization.


11 Such functions are closely allied to what Banach has called normal functions [2].
If $X_R$ is real-valued convex and BV, then $-|X_R|$ is normal, and hence (loc. cit.) is differentiable
a.e. On the other hand, every normal function is the difference of two non-negative
convex BV functions.













THEOREM 7.2. Let $X_R$ be a convex BV function having its values in an arbitrary
B-space, and suppose that $X_R$ satisfies the following condition:
(G') Among the $X_R$-maximal sequences there is at least one, $\{\pi_n\}$, such that if
$\mathfrak{Y}$ is the separable $X_R$-span of $\{\pi_n\}$, then among the sequences having property
$N(\mathfrak{Y})$ there is at least one, $\{\gamma_j\}$, such that $\gamma_j(X_R)$ is convex for each $j$ and $X_R$
is $\{\gamma_j\}$-pseudo-differentiable to $\theta_s$.
Then the real-valued ABV function $Q_R = \operatorname{Var}(X_R; R)$ is singular.


A combination of Theorems 7.1 and 7.2 leads to 


THEOREM 7.3. If $X_R$ is convex and BV, then (2.42), (2.43), and (2.44) are
equivalent conditions, and $X_R$ satisfies all three if it satisfies condition (G′) of
Theorem 7.2.


(B) Since the property of being essentially separably-valued is obviously a
necessary condition that $x_t$ be the derivative a.e. of an ABV function (see, for
example, the proof of Theorem 3.1), it is reasonable to include this property in
any set of sufficient conditions, as we have done in Theorems 2.5-2.8 and in
Theorem 5.1 of §5. The question might arise, however, as to whether or not
a set of sufficient conditions that includes this assumption still remains sufficient
when the assumption is omitted. A single example answers this in the negative
for all the above cited theorems. Let 𝔛 be the non-separable space consisting
of the bounded sequences of real numbers, and consider the example given by
Clarkson ([7], p. 414) of an additive function $X_R$ of linear figures that is defined
to 𝔛 and satisfies a Lipschitz condition, yet is nowhere differentiable. Here 𝔛
is the adjoint of the separable space 𝔜 composed of absolutely convergent series,
so that by Theorem 5.1 $X_R$ has a unique 𝔜-derivative $x_t$. On inspecting the
actual example, we see that $x_t$ fails to be essentially separably-valued and hence
fails to be integrable; yet $X_R$ and its 𝔜-derivative $x_t$ satisfy all the remaining
hypotheses in Theorems 2.5-2.8 and the theorems of §5.


REFERENCES 


1. S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932.

2. S. Banach, Sur une classe de fonctions d'ensemble, Fund. Math., vol. 6(1924), pp.
170-180.

3. G. Birkhoff, Integration of functions with values in a Banach space, Trans. Amer.
Math. Soc., vol. 38(1935), pp. 357-378.

4. S. Bochner, Absolut-additiv abstrakte Mengenfunktionen, Fund. Math., vol. 21(1933),
pp. 211-213.

5. S. Bochner, Integration von Funktionen, deren Werte die Elemente eines Vektorraumes
sind, Fund. Math., vol. 20(1933), pp. 262-276.

6. S. Bochner and A. E. Taylor, Linear functionals on certain spaces of abstractly-valued
functions, Annals of Math., vol. 39(1938), pp. 913-944.

7. J. A. Clarkson, Uniformly convex spaces, Trans. Amer. Math. Soc., vol. 40(1936), pp.
396-414.

8. N. Dunford, Integration of abstract functions, Bull. Amer. Math. Soc., vol. 42(1936),
p. 178 (abstract).

9. N. Dunford, Uniformity in linear spaces, Trans. Amer. Math. Soc., vol. 44(1938),
pp. 305-353.



10. N. Dunford and A. P. Morse, Remarks on the preceding paper of James A. Clarkson,
Trans. Amer. Math. Soc., vol. 40(1936), pp. 415-420.

11. V. Gantmakher and V. Šmulian, Sur les espaces dont la sphère unitaire est faiblement
compacte, Comptes Rendus de l'Acad. des Sc. de l'URSS, vol. 17(1937), pp. 91-94.

12. I. Gelfand, Zur Theorie abstrakter Funktionen, Comptes Rendus de l'Acad. des Sc.
de l'URSS, vol. 17(1937), pp. 243-245.

13. H. Hahn, Über lineare Gleichungssysteme in linearen Räumen, Journal für die r. u. a.
Mathematik, vol. 157(1927), pp. 214-229.

14. D. Milman, On some criteria for the regularity of spaces of type (B), Comptes Rendus
de l'Acad. des Sc. de l'URSS, vol. 20(1938), pp. 243-246.

15. B. J. Pettis, A note on regular Banach spaces, Bull. Amer. Math. Soc., vol. 44(1938),
pp. 420-428.

16. B. J. Pettis, On integration in vector spaces, Trans. Amer. Math. Soc., vol. 44(1938),
pp. 277-304.

17. B. J. Pettis, A proof that every uniformly convex space is reflexive, this Journal, vol.
5(1939), pp. 249-253.

18. S. Saks, Theory of the Integral, Warsaw-Lwów, 1937.


UNIVERSITY OF VIRGINIA. 











NON-COMMUTATIVE ARITHMETIC 
By R. P. DILWORTH


1. Introduction and summary. The problem of determining the conditions
that must be imposed upon a system having a single associative and commutative
operation in order to obtain unique factorization into irreducibles has
been studied by A. H. Clifford [1],¹ König [1], and Ward [2]. The more general
problem of determining similar conditions for the non-commutative case has
been treated by M. Ward [1]. However, the conditions given by Ward are more
stringent than those satisfied by actual instances of non-commutative arithmetic,
for example, quotient lattices and non-commutative polynomial theory
(Ore [1, 2]). Moreover, in both of these instances the factorization is unique
only up to a similarity relation, and instead of a single operation of multiplication
the additional operations G. C. D. and L. C. M. are involved.² Accordingly,
we shall concern ourselves with the arithmetic of a non-commutative
multiplication defined over a lattice.

As the decomposition of lattice quotients gives an important instance of non- 
commutative arithmetic, we shall summarize here a few of the fundamental 
ideas of Ore's theory (Ore [1]). Let $\Sigma$ be the set of quotients³

    $\alpha = \dfrac{a_1}{a_2}, \qquad a_2 \supset a_1, \quad a_1, a_2 \in L,$

where $L$ is a lattice in which the ascending chain condition holds. If $\beta = b_1/b_2$,
we define $(\alpha, \beta) = (a_1, b_1)/(a_2, b_2)$, $[\alpha, \beta] = [a_1, b_1]/[a_2, b_2]$. With these
definitions $\Sigma$ is a lattice which is modular or distributive if and only if $L$ is
modular or distributive. Ore defines the product $\alpha \cdot \beta$ only for elements $\alpha, \beta \in \Sigma$
such that $a_2 = b_1$, in which case $\alpha \cdot \beta = a_1/b_2$. Let us set $\alpha \asymp \beta$ if and only if
$a_2 = b_1$, so that a necessary and sufficient condition for the existence of the
product $\alpha \cdot \beta$ is that $\alpha \asymp \beta$. Although the relation $\asymp$ is neither reflexive nor
symmetric, it is in a certain sense transitive since

(1) if $\alpha \asymp \beta$ and $\gamma \asymp \delta$, then $\gamma \asymp \beta$ implies $\alpha \asymp \delta$.

Furthermore, the relation is preserved under union and cross-cut; that is,

(2) $\alpha \asymp \beta$, $\gamma \asymp \delta$ implies $(\alpha, \gamma) \asymp (\beta, \delta)$, $[\alpha, \gamma] \asymp [\beta, \delta]$.
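
The quotient construction can be tried out on a small concrete lattice. The following is a minimal sketch, assuming (as an illustration that is not taken from the paper) that L is the set of divisors of 12, with "a contains b" read as "a divides b", union as G.C.D. and cross-cut as L.C.M. The helper names `related`, `mul`, `union`, and `cross_cut` are hypothetical. The sketch checks by brute force the transitivity property (1), the preservation property (2), and the product rule that the text goes on to take as property (3) of the abstract relation.

```python
from itertools import product
from math import gcd

N = 12  # illustrative choice (an assumption): L = divisors of N
DIVISORS = [d for d in range(1, N + 1) if N % d == 0]


def lcm(a, b):
    return a * b // gcd(a, b)


# A quotient a1/a2 is stored as the pair (a1, a2) with a2 containing a1,
# which in this model means that a2 divides a1.
QUOTIENTS = [(a1, a2) for a1 in DIVISORS for a2 in DIVISORS if a1 % a2 == 0]


def related(x, y):
    """The relation of the text: x is related to y iff x2 == y1."""
    return x[1] == y[0]


def mul(x, y):
    """Product x.y = x1/y2, defined only when x is related to y."""
    if not related(x, y):
        raise ValueError("product undefined")
    return (x[0], y[1])


def union(x, y):
    return (gcd(x[0], y[0]), gcd(x[1], y[1]))


def cross_cut(x, y):
    return (lcm(x[0], y[0]), lcm(x[1], y[1]))


def check_property_1():
    # a~b, c~d and c~b together imply a~d
    return all(
        related(a, d)
        for a, b, c, d in product(QUOTIENTS, repeat=4)
        if related(a, b) and related(c, d) and related(c, b)
    )


def check_property_2():
    # the relation is preserved under union and cross-cut
    return all(
        related(union(a, c), union(b, d))
        and related(cross_cut(a, c), cross_cut(b, d))
        for a, b, c, d in product(QUOTIENTS, repeat=4)
        if related(a, b) and related(c, d)
    )


def check_property_3():
    # a~b and b~c imply a ~ b.c and a.b ~ c
    return all(
        related(a, mul(b, c)) and related(mul(a, b), c)
        for a, b, c in product(QUOTIENTS, repeat=3)
        if related(a, b) and related(b, c)
    )
```

In this model the relation is visibly neither reflexive nor symmetric: `related((12, 6), (6, 1))` holds while `related((6, 1), (12, 6))` does not.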


Received September 2, 1938. 

¹ The numbers in brackets refer to the references at the end of the paper.

² In this regard note that the necessary and sufficient conditions for unique factorization
in the commutative case are stated in their most elegant form in terms of the G. C. D.
operation (König [1]).

³ Our inclusion is the reverse of Ore's.



Also 


(3) $\alpha \asymp \beta$, $\beta \asymp \gamma$ implies $\alpha \asymp \beta \cdot \gamma$, $\alpha \cdot \beta \asymp \gamma$.


In the abstract theory given below we shall choose (1), (2), and (3) as the
defining properties of the abstract relation $\asymp$.

When we have a commutative multiplication, the connection of the multi- 
plication with the lattice operations automatically makes the lattice modular 
(in fact, distributive (Ward-Dilworth [1, 2])). If the multiplication is non- 
commutative, the lattice need not be modular; however, the assumption of the 
modular condition is essential since we shall prove that it is one of the necessary 
and sufficient conditions for arithmetic in a non-commutative semigroup. In 
particular, our results show the importance of the modularity⁴ of a non-commutative
polynomial domain in determining its arithmetical properties.

We conclude by stating our fundamental decomposition theorem for the
elements of a lattice $\Sigma$ with a multiplication having the properties of §2.


DECOMPOSITION THEOREM. Each element a of $\Sigma$ not equal to a unit has a
decomposition into irreducible elements. If there are two such decompositions

    $a = p_r p_{r-1} \cdots p_2 p_1 = q_s q_{s-1} \cdots q_2 q_1,$

then r = s and the p's and q's are similar in pairs.


2. The multiplication.⁵ Let $\Sigma$ be a lattice in which the ascending chain
condition holds and let $i$ denote its unit element. Consider in $\Sigma$ a relation $\asymp$
having the following properties:

T1. For each $a \in \Sigma$ there are elements $a', a'' \in \Sigma$ such that $a \asymp a'$, $a'' \asymp a$.
T2. $a \asymp b$, $a \asymp c$, and⁶ $b \asymp d$ → $c \asymp d$.

T3. $a \asymp b$, $c \asymp d$ → $(a, c) \asymp (b, d)$, $[a, c] \asymp [b, d]$.

T4. If $a \asymp b$, $c \asymp d$, then $c \asymp b \leftrightarrow a \asymp d$.

DEFINITION 2.1. Let $L_a$ denote the set of elements x such that $x \asymp a$.

By T1 and T3, $L_a$ is non-empty and closed with respect to union and cross-cut.
Hence $L_a$ is a sublattice of $\Sigma$. Thus with each element a of $\Sigma$ we associate a
sublattice $L_a$.

In a similar manner we may associate with each element a of $\Sigma$ a sublattice
$S_a$ defined as the set of all x's such that $a \asymp x$.


THEOREM 2.1. The lattices $L_a$ and $L_b$ are either disjoint or they are identical.


Proof. If $L_a$ and $L_b$ are not disjoint, they have an element c in common.
Let x be an arbitrary element of $L_a$. Then $x \asymp a$, $c \asymp a$, $c \asymp b$, and hence
$x \asymp b$ by T3. Similarly, each element of $L_b$ belongs to $L_a$.


⁴ That a non-commutative polynomial domain is modular can be easily seen from the
fact that the degree of a polynomial is a rank function over the lattice in the sense of
Birkhoff (Birkhoff [1], Ore [2]).

⁵ See Ward-Dilworth [1, 2] for lattice notation.

⁶ We shall use → to denote "implies".













Clearly, a similar result holds for $S_a$ and $S_b$.

Let now $L_s$ and $L_{s'}$ be the L-lattices corresponding to $s, s' \in S_a$. Then
$a \in L_s, L_{s'}$ and hence $L_s$ and $L_{s'}$ are identical by Theorem 2.1. Thus to each
S-lattice S we can associate an L-lattice L, where L is the L-lattice corresponding to
an arbitrary element of S; and conversely, to each element of L corresponds the
S-lattice S.

DEFINITION 2.2. We write a ~ b if a and b belong to the same L-lattice.

~ is clearly an equivalence relation in $\Sigma$ with the L-lattices as equivalence
classes.

We consider now a multiplication over $\Sigma$ having the following properties:
M00. To each pair of elements a, b such that $a \asymp b$, there is ordered a unique
element ab, the product of a and b.

M0.⁷ a = b → ac = bc, da = db.
M1. With each $a \in \Sigma$ there exists an element $u_a \asymp a$ such that $u_a a = a$.
M2. $a \supset ba$.
M3. (a, b)c = (ac, bc).
M4. (ab)c = a(bc).
M5. ac = bc → a = b.
M6. $a \asymp b$, $b \asymp c$ → $a \asymp bc$, $ab \asymp c$.
From M00-M6 follow


(2.1) 


Proof. $u_a a = a$ by M1. Hence $u_a \asymp ab$ by M6. But $c \asymp ab$ and thus
$c \asymp a$ by T4. A similar proof gives the second statement.


(2.2) $u_a$ is the unit element of $L_a$.

Proof. Let $x \in L_a$. Then $a \supset xa$ → $u_a a = a = (a, xa) = (u_a a, xa) =
(u_a, x)a$ by M1, M2, M3. Hence $u_a = (u_a, x)$ by M5.
(2.3) $a \supset b$ → $ac \supset bc$ if $a \asymp c$, $b \asymp c$ by M3.


Let ${}_aL$ denote the L-lattice to which a belongs. Let ${}_a u$ denote its unit element.
Then we have
(2.4) $a \, {}_a u = a$.

Proof. We have $a \asymp {}_a u$ since if $a \asymp x$, then ${}_a u \asymp x$ and ${}_a u \, x = x$ by (2.2)
and M1. But then $a \asymp {}_a u \, x$, and hence $a \asymp {}_a u$ by (2.1). Now $ax = a({}_a u \, x)
= (a \, {}_a u)x$ → $a = a \, {}_a u$ by M4 and M5.

DEFINITION 2.3. b is said to divide a on the right if there is an element $x \in \Sigma$
such that a = xb.

The element x of Definition 2.3 is unique by M5 and is called the quotient
of a and b.


⁷ In this and the remaining postulates the statements are assumed to hold if and only
if all the products appearing in the statements exist.










DEFINITION 2.4. If b divides [a, b] on the right, we write $a \odot b$ and denote
the quotient of [a, b] and b by $a \cdot b^{-1}$.


THEOREM 2.2. The quotient $a \cdot b^{-1}$ has the following properties:
R1. $a \supset (a \cdot b^{-1})b$;
R2. $a \supset xb$ → $a \cdot b^{-1} \supset x$.

Proof. R1 is clear from Definition 2.4. Let $a \supset xb$ so that $[a, b] \supset xb$ by M2.
Then

    $(a \cdot b^{-1})b \supset xb$ → $(a \cdot b^{-1}, x)b = ((a \cdot b^{-1})b, xb) = (a \cdot b^{-1})b.$

Hence $(a \cdot b^{-1}, x) = a \cdot b^{-1}$ by M5.

Since R1 and R2 are the defining properties of the residual (Ward-Dilworth
[1, 2]), we shall call $a \cdot b^{-1}$ the residual of b with respect to a. It exists only if
$a \odot b$.

We note that $a \odot a$ and $ba \odot a$ since $[a, a] = u_a a$ and $[a, ba] = ba$. Hence
the residuals $a \cdot a^{-1}$ and $ba \cdot a^{-1}$ always exist.


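(2.5) $a \cdot a^{-1} = u_a$.

(2.6) $(ab) \cdot b^{-1} = a$.

(2.7) $a \odot c$, $b \odot c$ → $[a, b] \odot c$.

(2.8) $[a, b] \cdot c^{-1} = [a \cdot c^{-1}, b \cdot c^{-1}]$ if $a \odot c$, $b \odot c$.
(2.9) $a \supset b$ if and only if $a \cdot b^{-1} = u_b$.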
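
The residual laws can be checked numerically in the simplest commutative model. The sketch below assumes (this model is an illustration, not part of the paper) that the elements are positive integers under ordinary multiplication, that "a contains b" means "a divides b", that union and cross-cut are G.C.D. and L.C.M., and that every unit is 1. Under these assumptions the residual is `lcm(a, b) // b`; the function names are hypothetical.

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def contains(a, b):
    """'a contains b' in the integer model: a divides b."""
    return b % a == 0


def residual(a, b):
    """Residual of b with respect to a: the quotient of [a, b] and b."""
    return lcm(a, b) // b


def check_residual_laws(limit=30):
    rng = range(1, limit + 1)
    for a in rng:
        assert residual(a, a) == 1                          # (2.5), unit is 1
        for b in rng:
            assert contains(a, residual(a, b) * b)          # R1
            assert residual(a * b, b) == a                  # (2.6)
            assert contains(a, b) == (residual(a, b) == 1)  # (2.9)
            for c in rng:
                if contains(a, c * b):                      # R2 with x = c
                    assert contains(residual(a, b), c)
                assert residual(lcm(a, b), c) == lcm(       # (2.8)
                    residual(a, c), residual(b, c)
                )
    return True
```

Because multiplication here is commutative and always defined, the existence conditions on the residual hold trivially; the sketch therefore only exercises the identities, not the conditions under which the residual exists.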


3. The decomposition theory. Throughout this section we make the following
assumptions:

A1.⁸ a ~ b → $a \odot b$.
A2. $\Sigma$ is modular.
As consequences of A1 we have
(3.1) $a \odot c$, $b \odot c$ → $a \cdot c^{-1} \odot b \cdot c^{-1}$.
Proof. Since $a \cdot c^{-1} \asymp c$, $b \cdot c^{-1} \asymp c$, we have $a \cdot c^{-1} \odot b \cdot c^{-1}$ by A1.
(3.2) a ~ b → $a \odot (a, b)$.
(3.3) $(ba) \cdot c^{-1} = (b \cdot (c \cdot a^{-1})^{-1})(a \cdot c^{-1})$ if $b \asymp a$, $c \odot a$, $a \odot c$, $ba \odot c$.

⁸ A1 is much stronger than necessary. However, since the weaker formulations are
more complicated and artificial, and since the methods of proof remain essentially the
same, we adopt the present formulation. If the proofs are examined, the various weaker
conditions will be readily apparent to the reader (as, for example, the set given in §5).
Also A1 is always satisfied in the important instances of the theory.













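Proof. $((ba) \cdot c^{-1})c = [ba, c] = [ba, c, a] = ([ba, c] \cdot a^{-1})a = [b, c \cdot a^{-1}]a$ by (2.8).
Now $ba \cdot a^{-1} \odot c \cdot a^{-1}$ by (3.1) and hence $b \odot c \cdot a^{-1}$ by (2.6). Hence

    $((ba) \cdot c^{-1})c = ((b \cdot (c \cdot a^{-1})^{-1})(c \cdot a^{-1}))a = (b \cdot (c \cdot a^{-1})^{-1})((c \cdot a^{-1})a)$
    $= (b \cdot (c \cdot a^{-1})^{-1})[a, c] = (b \cdot (c \cdot a^{-1})^{-1})((a \cdot c^{-1})c) = ((b \cdot (c \cdot a^{-1})^{-1})(a \cdot c^{-1}))c$

by M4 and Definition 2.4.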

DEFINITION 3.1. If $a' = a \cdot b^{-1}$, where a ~ b and $(a, b) = {}_a u$, we say that a'
is conjugate to a.

DEFINITION 3.2. a is similar to b if there exists a chain of elements $a = a_1,
a_2, \cdots, a_n = b$ such that either $a_i$ is conjugate to $a_{i+1}$ or $a_{i+1}$ is conjugate to $a_i$
(Ore [1]).

The relation of similarity is clearly reflexive, symmetric, and transitive.

DEFINITION 3.3. An element $p \in \Sigma$ is irreducible if $p \ne {}_p u$, and if $x \supset p$,
x ~ p → $x = {}_p u$ or x = p.


(3.4) If p is irreducible, then $p \not\supset a$ and p ~ a → $(p, a) = {}_p u$.


(3.5) If p is irreducible and $p \supset ab$, p ~ b, $p \not\supset b$; then $p' \supset a$, where p' is
conjugate to p.


Proof. Take $p' = p \cdot b^{-1}$.

THEOREM 3.1. An element conjugate to an irreducible element is an irreducible 
element. 

Proof. Let p be an irreducible element, and let $p' = p \cdot a^{-1}$, where p ~ a
and $(p, a) = {}_p u$. Let $x \supset p'$ with x ~ p'. Then $x \supset p \cdot a^{-1}$ and $xa \supset (p \cdot a^{-1})a
= [a, p]$ by (2.3), Definition 2.4. Hence $xa = (xa, [a, p]) = [a, (xa, p)]$ by A2.
Thus $x = (xa) \cdot a^{-1} = [a, (xa, p)] \cdot a^{-1} = (xa, p) \cdot a^{-1}$ by (2.6), (2.8), (2.9). Now
$a \asymp d$ for some d by T1 and hence $p'a \asymp d$, $xa \asymp d$ by M6. Thus we have
xa ~ p'a = $(p \cdot a^{-1})a = [a, p]$. But since p ~ a, [a, p] ~ p and hence xa ~ p.
$(xa, p) \supset p$ gives $(xa, p) = {}_p u$ or p since p is irreducible; and hence $x = {}_p u \cdot a^{-1}
= u_a = {}_{p'} u$ or $x = p \cdot a^{-1} = p'$. This proves the theorem.


THEOREM 3.2. If an irreducible p is conjugate to an element p’, then p’ is an 
irreducible. 

Proof. Let $p = p' \cdot a^{-1}$, where p' ~ a and $(p', a) = {}_{p'} u$, and let $x \supset p'$, x ~ p'.
Then $x \cdot a^{-1} \supset p' \cdot a^{-1}$ by (2.8), A1 and thus $x \cdot a^{-1} \supset p$. Also $x \cdot a^{-1}$ ~ p since
$x \cdot a^{-1} \asymp a$ and $p \asymp a$. Hence we have either (i) $x \cdot a^{-1} = {}_p u$ or (ii) $x \cdot a^{-1} = p$.
If (i) holds, then $x \cdot a^{-1} = u_a$ since $p \asymp a$ and hence $x \supset a$ by (2.9). But
$x \supset p'$ by hypothesis, hence $x \supset (a, p') = {}_{p'} u$. And since x ~ p', $x = {}_{p'} u$. If
(ii) holds, we have $(x \cdot a^{-1})a = (p' \cdot a^{-1})a$ or [x, a] = [p', a] by Definition 2.4.
But then $x \supset p' \supset [x, a]$ and by A2, $p' = [x, (p', a)] = [x, {}_{p'} u] = x$. Hence
either $x = {}_{p'} u$ or x = p', and thus p' is an irreducible. This completes the proof.


THEOREM 3.3. Every element similar to an irreducible element is irreducible. 


Proof. The theorem is clear from Theorems 3.1 and 3.2 and Definition 3.2. 













THEOREM 3.4. Let a' be conjugate to $a = a_k a_{k-1} \cdots a_2 a_1$, then $a' = a'_k a'_{k-1}
\cdots a'_2 a'_1$, where $a'_i$ is conjugate to $a_i$.

Proof. Suppose the theorem is true for every product of k − 1 elements
and let $a' = a \cdot b^{-1}$, a ~ b, $(a, b) = {}_a u$. Then $a \odot b$, $a_1 \odot b$, $b \odot a_1$ since a ~ b
and $a_1$ ~ b. Thus $a' = ((a_k \cdots a_2) \cdot (b \cdot a_1^{-1})^{-1})(a_1 \cdot b^{-1})$ by (3.3). Now $a_1$ ~ b
and $(a_1, b) \supset (a, b) = {}_a u$ which gives $(a_1, b) = {}_a u$. Hence $a'_1 = a_1 \cdot b^{-1}$ is
conjugate to $a_1$. Let $b' = b \cdot a_1^{-1}$ and $s = a_k \cdots a_2$. Then b' ~ s since $b' \asymp a_1$
and $s \asymp a_1$. Now $(s, b')a_1 = (sa_1, b'a_1) = (a, (b \cdot a_1^{-1})a_1) = (a, [b, a_1]) = [a_1,
(b, a)] = [a_1, {}_a u] = a_1$ by Definition 2.4 and A2. Hence $(s, b') = a_1 \cdot a_1^{-1} = {}_s u$.
Thus $s \cdot b'^{-1}$ is conjugate to s and hence $s \cdot b'^{-1} = a'_k a'_{k-1} \cdots a'_2$ by hypothesis.
Substitution gives $a' = a'_k \cdots a'_1$. The theorem is therefore proved.

We now prove the fundamental 


UNIQUENESS THEOREM. If an element $a \in \Sigma$ has two representations as a
product of irreducibles

    $a = p_r p_{r-1} \cdots p_2 p_1 = q_s q_{s-1} \cdots q_2 q_1,$

then r = s and the p's and q's are similar in pairs.


Proof.⁹ Let

(1) $a = p_r p_{r-1} \cdots p_2 p_1 = q_s q_{s-1} \cdots q_2 q_1.$

If $p_1 = q_1$, this factor may be canceled. If $p_1 \ne q_1$, let k be the first number
such that $q_1 \supset p_k p_{k-1} \cdots p_2 p_1$; then $q_1 \not\supset p_{k-1} \cdots p_1$ and $p_{k-1} \cdots p_1$ ~ $q_1$.
Hence $(q_1, p_{k-1} \cdots p_1) = {}_{q_1} u$ by (3.4). But then $q_1 \cdot (p_{k-1} \cdots p_1)^{-1} \supset p_k$ and
$q'_1 = q_1 \cdot (p_{k-1} \cdots p_1)^{-1}$ is conjugate to $q_1$ and thus is an irreducible by Theorem
3.2. Hence $q'_1 = p_k$. Now $p_k p_{k-1} \cdots p_1 = (q_1 \cdot (p_{k-1} \cdots p_1)^{-1}) p_{k-1} \cdots p_1
= [q_1, p_{k-1} \cdots p_1] = ((p_{k-1} \cdots p_1) \cdot q_1^{-1}) q_1$ by Definition 2.4. Hence $p_k p_{k-1}
\cdots p_1 = p'_{k-1} \cdots p'_1 q_1$ by Theorem 3.4 and $p'_i$ is conjugate to $p_i$ (i = 1, ···, k − 1).
Substituting this result in (1) and canceling $q_1$, we may treat the resulting
expression in the same manner. Thus we find r = s and the p's and q's similar
in pairs.

Concerning the existence of a decomposition into irreducibles we have the 

EXISTENCE THEOREM. If the descending chain condition holds for the right
factors of $a \ne {}_a u$, then a has a decomposition into irreducible elements.


Proof. If a is not an irreducible, then there is an element $a_1 \ne {}_a u$ such that
$a_1 \supset a$, $a_1$ ~ a and $a_1 \ne a$. But then $a = (a \cdot a_1^{-1})a_1$ by A1. If $a_1$ is not an
irreducible, we have an $a_2 \ne {}_a u$ such that $a_2 \supset a_1$, $a_2$ ~ $a_1$, and $a_2 \ne a_1$. Then
$a = (a \cdot a_1^{-1})(a_1 \cdot a_2^{-1})a_2$. Thus we get a chain of elements $a \subset a_1 \subset a_2 \subset \cdots$
which must break off giving an irreducible element $p_1$ such that $a = bp_1$. But
if b is not an irreducible, $b = b_1 p_2$. Since $p_1 \supset p_2 p_1 \supset \cdots$ is a descending chain
of factors of a, it must break off giving a decomposition $a = p_k p_{k-1} \cdots p_2 p_1$.
The proof of the theorem is complete.


⁹ This proof is essentially that given by Ore [2] for non-commutative polynomials.













We note that the descending chain condition for the factors of an element
of $\Sigma$ does not follow from the ascending chain condition in $\Sigma$, as in the
commutative case. However, it does follow from the ascending chain condition in
$\Sigma'$, where $\Sigma'$ is the lattice of left union and cross-cut if they exist.

As examples of the abstract theory let $\Sigma$ be the lattice N of a non-commutative
polynomial domain. Then for every a and b, $a \asymp b$ and $a \odot b$ so that T1-T4,
M6 and A1 are trivially satisfied. The relations of similarity and conjugacy
are identical. Furthermore, in N the irreducible elements are those elements
whose only right divisors are the elements themselves and the elements of the
fundamental field.

More generally the above results apply to any non-commutative domain of
integrity having a Euclidean algorithm.

Again if we interpret $\Sigma$ to be the quotient lattice Q of a lattice L, we have
$A \asymp B$, where $A = a_1/a_2$, $B = b_1/b_2$ if and only if $a_2 = b_1$ in which case $AB
= a_1/b_2$. Furthermore $u_A = a_1/a_1$, ${}_A u = a_2/a_2$. Postulates T1-T4 are
clearly satisfied by the relation $\asymp$, and it is readily verified that the multiplication
satisfies M00-M6. We have $A \odot B$ if and only if $a_2 \supset b_2$ in which case
$A \cdot B^{-1} = [a_1, b_1]/b_1$. We observe that A ~ B if and only if $a_2 = b_2$ so that A1
is satisfied. If we start with a modular lattice L, then $\Sigma$ is modular and A2 is
satisfied. The irreducible elements are those quotients p for which $p_2$ covers $p_1$.¹⁰
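
The quotient-lattice example lends itself to a small experiment. The following is a minimal sketch, assuming (for illustration only) that L is the divisor lattice of an integer, so that a quotient is a pair `(a1, a2)` with `a2` dividing `a1`, and `a2` covers `a1` exactly when `a1 // a2` is prime. The function `decompositions` is a hypothetical helper that lists every way of writing a quotient as a product of irreducible quotients. Using the integer ratio `a1 // a2` as a label for the similarity class is a further assumption that is reasonable only for this particular distributive model.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def is_prime(n):
    return n > 1 and all(n % k for k in range(2, int(n ** 0.5) + 1))


def decompositions(a1, a2):
    """All products of irreducible quotients equal to a1/a2.

    Each factor (x, y) satisfies: y divides x and x // y is prime,
    and consecutive factors (x, y), (y, z) are multipliable.
    """
    if a1 == a2:
        return [[]]
    result = []
    for mid in divisors(a1):
        # (a1, mid) is irreducible when a1 // mid is prime;
        # mid must still be contained in a2, i.e. a2 divides mid.
        if mid % a2 == 0 and is_prime(a1 // mid):
            for rest in decompositions(mid, a2):
                result.append([(a1, mid)] + rest)
    return result


def ratios(chain):
    """Sorted list of prime ratios, used here as similarity labels."""
    return sorted(x // y for x, y in chain)
```

For the quotient 12/1 the sketch yields three decompositions, all of the same length and with the same labels up to order, which is what the decomposition theorem predicts for this example.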


4. The arithmetic of a semigroup. Let S be a semigroup of elements a, b, c,
··· and unit element $i$ such that each pair of elements a, b has a G. C. D. (a, b).
Then if the ascending chain condition holds in S, a and b have an L. C. M.
defined as the G. C. D. of those elements which both a and b divide. As in §2
we define $a \cdot b^{-1} = [a, b]/b$.

DEFINITION 4.1. If $a' = a \cdot b^{-1}$, where $(a, b) = i$, we say that a' is conjugate
to a.

DEFINITION 4.2. a is similar to b if there exists a chain of elements $a = a_1,
\cdots, a_n = b$ such that either $a_i$ is conjugate to $a_{i+1}$ or $a_{i+1}$ is conjugate to $a_i$.
We have then the following fundamental theorem: 


THEOREM 4.1. Let S be a semigroup with G. C. D. and L. C. M. operations.
(S is thus a lattice with respect to G. C. D. and L. C. M.) Then the following 


¹⁰ As another example, consider the set M of all finite matrices for which the number of
rows is greater than or equal to the number of columns over a non-commutative ring R
with unit element. A subset A of M is called an ideal if the matrices of A (i) all have the
same number of rows and the same number of columns, (ii) are closed under addition,
and (iii) are closed under multiplication by all square matrices for which the product exists.
We write $A \asymp B$ if the matrices of A have the same number of columns as the matrices of B
have rows. The product of A and B is defined only if $A \asymp B$ and is the ideal generated
by the products of the matrices of A with those of B. With a suitable definition of union
and cross-cut the set $\Sigma$ of ideals of M satisfies T1-T4, M00-M4, M6, A2. Moreover, if we
give a similar definition of left ideals in M and R is a non-commutative domain of integrity
for which every left ideal is principal, then the set $\Sigma$ of left ideals of M satisfies T1-A2 and
is an instance of our abstract theory. A detailed account of these systems will be given
in another paper.










three conditions are necessary and sufficient that each element not equal to $i$ of S be
expressible as a product of irreducibles unique up to similarity:¹¹

(i) the ascending chain condition in S; 

(ii) the descending chain condition for the right factors of each element in S; 

(iii) the modular condition in S. 


Proof. The sufficiency of conditions (i)-(iii) follows from the results of §3.
For since the product of any two elements always exists, T1-T4 are trivially
satisfied. M00-M6 are readily verified and A1 is trivially true since $a \odot b$
for every a and b. (iii) gives A2. Hence the existence and uniqueness
theorems of §3 hold.

Suppose now that each element not equal to $i$ of S is uniquely (up to
similarity) expressible as a product of irreducibles. We define $\rho(i) = 0$, $\rho(a) = s$
if $a = p_s p_{s-1} \cdots p_2 p_1$. Then $\rho(a) = 0$ if and only if $a = i$ and $\rho(a) = 1$ if
and only if a is an irreducible. Furthermore, $\rho(ab) = \rho(a) + \rho(b)$ since if
$a = p_{\rho(a)} \cdots p_1$ and $b = q_{\rho(b)} \cdots q_1$, then $ab = p_{\rho(a)} \cdots p_1 q_{\rho(b)} \cdots q_1$. Hence
$a \supset b$ and $a \ne b$ implies that $\rho(a) < \rho(b)$. It follows that the ascending chain
condition holds in S and the descending chain condition holds for the factors
of each element in S.

We note that if a' is similar to a, $\rho(a') = \rho(a)$.

Let a and b be any two elements of S. We have then $a = a_1(a, b)$, $b = b_1(a, b)$,
where $(a_1, b_1) = i$. Then

    $[a, b] = [a_1(a, b), b_1(a, b)] = [a_1, b_1](a, b) = (a_1 \cdot b_1^{-1})b_1(a, b) = a'_1 b_1 (a, b),$

where $a'_1$ is similar to $a_1$. Then

    $\rho([a, b]) = \rho(a'_1) + \rho(b_1) + \rho((a, b)) = \rho(a_1) + \rho(b_1) + \rho((a, b)).$

But $\rho(a) = \rho(a_1) + \rho((a, b))$, so that $\rho(a_1) = \rho(a) - \rho((a, b))$. Similarly, $\rho(b_1)
= \rho(b) - \rho((a, b))$. Hence $\rho([a, b]) = \rho(a) + \rho(b) - \rho((a, b))$ or $\rho([a, b]) + \rho((a, b))
= \rho(a) + \rho(b)$. Thus $\rho$ is a rank function over S in the sense of Birkhoff
(Birkhoff [1], p. 447) and S is modular by Birkhoff's result. Hence conditions
(i)-(iii) are satisfied.
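
The rank identity at the end of this proof can be illustrated in the familiar commutative case. The sketch below assumes (as an example outside the paper) the semigroup of positive integers under multiplication, where the irreducibles are the primes and the rank of n is the number of prime factors counted with multiplicity. The names `rho` and `rank_identity_holds` are hypothetical.

```python
from math import gcd


def rho(n):
    """Number of prime factors of n counted with multiplicity; rho(1) = 0."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return count


def rank_identity_holds(limit=60):
    """Check rho(ab) = rho(a) + rho(b) and the rank-function identity
    rho([a, b]) + rho((a, b)) = rho(a) + rho(b) for 1 <= a, b <= limit."""
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            g = gcd(a, b)
            l = a * b // g
            if rho(a * b) != rho(a) + rho(b):
                return False
            if rho(l) + rho(g) != rho(a) + rho(b):
                return False
    return True
```

In this model similarity reduces to equality, so the step from $a_1$ to a similar element in the proof is invisible; the sketch only confirms the additive bookkeeping that makes the rank a rank function.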


5. Properties of the L-lattices. Using the notations of §3, we make

DEFINITION 5.1. The unit elements of the L-lattices are called the units of $\Sigma$.

Let now $a_1, a_2 \in L$, $a'_1, a'_2 \in L'$, where L and L' are any two L-lattices. Then
if $a_1, a_2 \asymp x_1$, $a'_1, a'_2 \asymp x_2$, we have $(a_1, a'_1) \asymp (x_1, x_2)$ and $(a_2, a'_2) \asymp (x_1, x_2)$.
Hence $(a_1, a'_1)$ and $(a_2, a'_2)$ belong to the same L-lattice. We call this L-lattice
to which all the unions of elements from $L_1$ and $L_2$ respectively belong the
union of $L_1$ and $L_2$ and write $(L_1, L_2)$. In a similar manner we define the
cross-cut $[L_1, L_2]$ of two L-lattices. Hence we make the L-lattices into a lattice
$\Sigma_L$. $\Sigma_L$ will be modular if $\Sigma$ is modular.


¹¹ By "up to similarity" we mean that the irreducibles appearing in the decompositions
of similar elements are similar in pairs.













In general, the union of the unit elements of $L_1$ and $L_2$ will not be the unit
element of $(L_1, L_2)$. However, we prove


THEOREM 5.1. If the descending chain condition holds in $\Sigma$, then the units of $\Sigma$
are closed under union and cross-cut and form a lattice isomorphic to $\Sigma_L$.


Proof. We prove first a necessary lemma.


LEMMA 5.1. If the descending chain condition holds in $\Sigma$, then the only elements
of $\Sigma$ such that $x \asymp x$ are the units of $\Sigma$.

Proof of lemma. We note that $u \asymp u$ for every unit u, since if $u \asymp x$, then
x = ux; and hence $u \asymp u$ by M6. Now let $a \asymp a$. Then the chain $a, a^2, a^3, \cdots$
must break off by the descending chain condition so that $a^{m+1} = a^m$ or $a^m = u_a$
by M5. But since $a \asymp a$, $u_a = {}_a u$ and hence $a^m = {}_a u$. We have then $a^{m-1}
= {}_a u \cdot a^{-1} = {}_a u$, and finally $a = {}_a u$.

We continue with the proof of the theorem. Let u and u' be two units,
so that $u \asymp u$, $u' \asymp u'$. Then $(u, u') \asymp (u, u')$ and $[u, u'] \asymp [u, u']$. Whence
(u, u') and [u, u'] are units by Lemma 5.1. This completes the proof of the
theorem.

The L-lattices have a number of interesting interrelations. We mention, 
however, only one: 


THEOREM 5.2. Let L be an arbitrary L-lattice and let $l \in L$. Then L has a
sublattice isomorphic to $L_l$ with l as the unit element.


Proof. Let $x \in L_l$, and set up the correspondence $x \leftrightarrow xl$, where xl is clearly
in L. Then $(x, y) \leftrightarrow (x, y)l = (xl, yl)$ and $[x, y] \leftrightarrow [x, y]l = [xl, yl]$. Furthermore,
the correspondence is 1-1 by M5. Hence the theorem follows.

We next characterize the irreducibles of $\Sigma$ in terms of the lattice properties
of the L-lattices.

THEOREM 5.3. The irreducibles of $\Sigma$ are the divisor-free elements of the L-lattices.

Proof. Let p be a divisor-free element of an L-lattice, and let $x \supset p$, x ~ p;
then clearly $x = {}_p u$ or x = p. Conversely, if p is an irreducible, let p' be the
divisor-free element of ${}_p L$ dividing p. Then $p' \supset p$, p' ~ p and $p' \ne {}_p u$, and
hence p' = p.

We note that Theorem 5.3 may not hold if we weaken postulate A1. For
example, let us replace A1 by


B1. $a \odot b$, $a \odot c$, $b \odot c$ → $a \cdot c^{-1} \odot b \cdot c^{-1}$.
B2. $a \supset b$ and $b \odot c$ → $a \odot c$.
B3. $a \odot b$ and a ~ b → $a \odot (a, b)$.


We define conjugate elements and irreducible elements by

DEFINITION 5.1. If $a' = a \cdot b^{-1}$, where a ~ b, $a \odot b$, $b \odot a$, and $(a, b) = {}_a u$,
we say that a' is conjugate to a.










DEFINITION 5.2. p is an irreducible if $p \ne {}_p u$ and if $x \supset p$, x ~ p, $p \odot x$ →
x = p or $x = {}_p u$.

With these definitions the proofs of the existence and uniqueness theorems
follow, with some modifications as in §3. However, there may be irreducibles
which are not divisor-free elements since we may have $x \supset p$, $x \ne p, {}_p u$; x ~ p,
but $x \odot p$ is not true.

We conclude this section with the investigation of the special case where each
L-lattice and its corresponding S-lattice are identical. Then $\asymp$ is an equivalence
relation and the equivalence classes are multiplicatively closed sublattices.


Let $a = a_r \cdots a_1 = b_s \cdots b_1$ be two decompositions of a. Then $a \asymp a_1$, and
$a \asymp b_1$. But $a_r \cdots a_2 \asymp a_1$ and $a_r \cdots a_3 \asymp a_2$. Hence $a \asymp a_1 \asymp \cdots \asymp a_r
\asymp b_1 \asymp \cdots \asymp b_s$.


Thus this case reduces to that of $\Sigma$ closed under multiplication.


6. The commutative case. In this section we investigate the consequences
of assuming that the multiplication is commutative. Explicitly we assume


A3. $a \asymp b$ → $b \asymp a$, ab = ba.

We have then
(6.1) $a \asymp b$, $b \asymp c$ → $a \asymp c$.

Proof. Since $a \asymp b$, ab exists and $ab \asymp c$. But $ab = ba \asymp c$, whence $a \asymp c$.
(6.2) $a \asymp a$.


Hence $\asymp$ is an equivalence relation giving equivalence classes {a}, {b}, ···.
The L-lattices and the equivalence classes coincide. Furthermore, we note
that each equivalence class is closed with respect to union, cross-cut,
multiplication and residuation. We note also that $u_a = {}_a u$.


THEOREM 6.1. a' is conjugate to a if and only if a' = a.
Proof. We prove first a series of lemmas.
LEMMA 6.1. If a ~ b ~ c and $b \supset c$, then $a \cdot c^{-1} \supset a \cdot b^{-1}$.


Proof. The residuals exist by A1. Furthermore, $a \supset (a \cdot b^{-1})b \supset (a \cdot b^{-1})c$
→ $a \cdot c^{-1} \supset a \cdot b^{-1}$ by R1 and R2.


LEMMA 6.2. a ~ b ~ c → $a \cdot (b, c)^{-1} = [a \cdot b^{-1}, a \cdot c^{-1}]$.


Proof. $[a \cdot b^{-1}, a \cdot c^{-1}] \supset a \cdot (b, c)^{-1}$ by Lemma 6.1. But $a \supset ((a \cdot b^{-1})b,
(a \cdot c^{-1})c) \supset [a \cdot b^{-1}, a \cdot c^{-1}](b, c)$. Hence $a \cdot (b, c)^{-1} \supset [a \cdot b^{-1}, a \cdot c^{-1}]$ and thus
$a \cdot (b, c)^{-1} = [a \cdot b^{-1}, a \cdot c^{-1}]$.


LEMMA 6.3. $a \cdot {}_a u^{-1} = a \cdot u_a^{-1} = a$.


Proof. $a \supset a u_a$ → $a \cdot u_a^{-1} \supset a$. But $a = a u_a \supset (a \cdot u_a^{-1})a$ → $a \supset a \cdot u_a^{-1}$.
Hence $a = a \cdot u_a^{-1}$. Since $u_a = {}_a u$, the lemma follows.













We continue with the proof of the theorem. Let $a' = a \cdot b^{-1}$, a ~ b, $(a, b)
= {}_a u$. Then

    $a = a \cdot {}_a u^{-1} = a \cdot (a, b)^{-1} = [a \cdot a^{-1}, a \cdot b^{-1}] = [{}_a u, a \cdot b^{-1}] = a \cdot b^{-1} = a'$

by Lemmas 6.2 and 6.3. The proof of the theorem is complete.

Now obviously any irreducible factor of an element belongs to the same
equivalence class as the element itself. Furthermore, since multiplication is
commutative, the ascending chain condition implies the descending chain
condition for the factors of an element $a \in \Sigma$. Hence by the uniqueness and
existence theorems and Theorem 6.1 each element not a unit in $\Sigma$ is uniquely
expressible as a product of irreducibles, the irreducibles belonging to the same
equivalence class. Thus considered as a lattice, each equivalence class is a
direct product of chain lattices; i.e., an arithmetical lattice (Ward [3]).


REFERENCES

GARRETT BIRKHOFF.
1. On combination of subalgebras, Proc. Cambridge Phil. Soc., vol. 29(1933), pp. 441-464.

A. H. CLIFFORD.
1. Arithmetic and ideal theory of abstract multiplication, Bull. Am. Math. Soc., vol.
40(1934), pp. 326-330.

J. KÖNIG.
1. Algebraischen Grössen, Leipzig, 1903, Chapter I.

O. ORE.
1. Abstract algebra, I, II, Annals of Math., vol. 36(1935), pp. 406-437; vol. 37(1936),
pp. 265-292.
2. Theory of non-commutative polynomials, Annals of Math., vol. 34(1933), pp. 480-508.

M. WARD.
1. Postulates for an abstract arithmetic, Proc. Natl. Acad. Sci., vol. 14(1928), pp. 907-911.
2. Conditions for factorization in a set closed under a single operation, Annals of Math.,
vol. 36(1933), pp. 36-39.
3. Structure residuation, Annals of Math., vol. 39(1938), pp. 558-568.

M. WARD AND R. P. DILWORTH.
1. Residuated lattices, Proc. Natl. Acad. Sci., vol. 24(1938), pp. 162-165.
2. Residuated lattices, to appear in the Trans. of the Amer. Math. Soc.

J. H. M. WEDDERBURN.
1. Non-commutative domains of integrity, Journal für Math., vol. 167(1931), pp. 129-141.


CALIFORNIA INSTITUTE OF TECHNOLOGY. 



















ASYMPTOTIC FORMS FOR A GENERAL CLASS OF HYPERGEOMETRIC
FUNCTIONS WITH APPLICATIONS TO THE
GENERALIZED LEGENDRE FUNCTIONS


By GEORGE E. ALBERT
|. Introduction. The classical differential equation of Jacobi [5]' 
(l) (l— 2 )y” + (B —a— (a+ B+ 2)zly’ + rv +at+B6t lyy =0 
issolved by the pair of hypergeometric functions (to be designated as the Jacobi 
functions) 
(Yio? (2) = Fv tatBt1, —y;a4+ 1; 1 — 2), 
@ ( ¥i"@ =e -perr 
‘Fvu+atBt+i,v+B4+1;2+a+ 6+ 2;2/(1 —2z)). 


In the following pages forms will be derived for the Jacobi functions (2) which
are asymptotic with respect to the large parameter $\nu$.
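As a quick numerical sanity check on the Jacobi equation and its first hypergeometric solution, the following sketch sums the Gauss series directly and evaluates the left side of the differential equation. It is only an illustration: the function names `hyp2f1` and `jacobi_y1_residual` are hypothetical, the series is used only where $|\tfrac12(1-z)| < 1$, and the derivatives come from the standard rule $\frac{d}{dx}F(a,b;c;x) = \frac{ab}{c}F(a+1,b+1;c+1;x)$.

```python
def hyp2f1(a, b, c, x, terms=400):
    """Gauss series F(a, b; c; x), summed term by term (intended for |x| < 1)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * x
    return total


def jacobi_y1_residual(nu, alpha, beta, z):
    """Left side of the Jacobi equation evaluated on the first Jacobi function.

    y(z) = F(nu+alpha+beta+1, -nu; alpha+1; (1-z)/2); the result should be ~0.
    """
    a, b, c = nu + alpha + beta + 1, -nu, alpha + 1
    x = 0.5 * (1 - z)
    y = hyp2f1(a, b, c, x)
    # Chain rule with dx/dz = -1/2.
    dy = -0.5 * (a * b / c) * hyp2f1(a + 1, b + 1, c + 1, x)
    d2y = 0.25 * (a * b * (a + 1) * (b + 1) / (c * (c + 1))) * hyp2f1(a + 2, b + 2, c + 2, x)
    return ((1 - z * z) * d2y
            + (beta - alpha - (alpha + beta + 2) * z) * dy
            + nu * (nu + alpha + beta + 1) * y)


if __name__ == "__main__":
    print(abs(jacobi_y1_residual(2.5, 0.3, 0.7, 0.6)))
```

The residual is at rounding level for moderate parameters; for large $|\nu|$ the plain series loses accuracy, which is one practical motivation for asymptotic forms.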

The Legendre functions of complex degree, order, and argument are defined
in terms of the Jacobi functions (2) by the formulas

$$\begin{aligned}
P_\nu^\mu(z) &= \frac{1}{\Gamma(1-\mu)}\Big(\frac{z+1}{z-1}\Big)^{\mu/2}\,Y_{\nu,1}^{(-\mu,\mu)}(z),\\
2Q_\nu^\mu(z) &= e^{\mu\pi i}\,\frac{\Gamma(\nu+1)\,\Gamma(\nu+\mu+1)}{\Gamma(2\nu+2)}\Big(\frac{z+1}{z-1}\Big)^{\mu/2}\,Y_{\nu,2}^{(-\mu,\mu)}(z);
\end{aligned}\tag{3}$$

see Hobson [3] or [4]. In virtue of these relations between the two classes of
functions, asymptotic forms will be at hand for the Legendre functions for
values of $|\nu|$ which are large in comparison with $|\mu|$, and conversely.


I. The Jacobi functions 


2. The normalization of the differential equation. In the differential equation
(1) the numbers $\nu$, $\alpha$, and $\beta$ will be subject to the blanket restrictions that $|\nu|$
be large and $|\alpha|$, $|\beta|$ be bounded; otherwise they are general complex numbers.
The variable z will be allowed to range over the unbounded complex plane, cut
along the axis of reals from the point $z = 1$ to $z = -\infty$, with the exception of
an arbitrarily small neighborhood of the point $z = -1$. The domain of z thus
defined will be consistently designated by $R_z$. In virtue of the known continuation
formulas for hypergeometric functions, the omission of any small
neighborhood of the point $z = -1$ involves no loss of generality in the results
to be obtained.


Received October 8, 1938; in revised form, March 22, 1939.
¹ Numbers in brackets refer to the bibliography at the end of the paper.













The transformations 
$$z = 1 + \tfrac14 s^2, \tag{4}$$
(5) y(s) = 2° sP"1(g + 8?) u(s) 


change the differential equation (1) into the form

$$u''(s) + \Big\{\rho^2\phi^2(s) + \frac{\tfrac14 - \alpha^2}{s^2} + \chi(\alpha, \beta, s)\Big\}\,u(s) = 0 \tag{6}$$

wherein

$$\rho = i\big[\nu + \tfrac12(\alpha+\beta+1)\big],\qquad
\phi^2(s) = \tfrac12\big(1 + \tfrac18 s^2\big)^{-1},\qquad
\chi(\alpha, \beta, s) = \frac{\alpha^2 + \beta^2 - 1 + \tfrac18\alpha^2 s^2}{8\big(1 + \tfrac18 s^2\big)^2}. \tag{7}$$

Under the transformation (4) a one-to-one correspondence is established
between $R_z$ and a domain $R_s$ consisting of that half-plane of the variable s given
by the inequalities

$$-\tfrac12\pi < \arg s \le \tfrac12\pi,$$

with the exception of certain small neighborhoods of the points $s = \pm i\sqrt{8}$.

For the sake of definiteness the following agreements will be made: the root
in (5) and the root $\phi(s)$ will be chosen as those which have positive real values
when s is positive real; the symbol $\alpha$ will denote that root of $\alpha^2$ for which
$-\tfrac12\pi < \arg\alpha \le \tfrac12\pi$; the parameter $\rho$ will, for convenience, be chosen so that
$-\tfrac12\pi - \delta < \arg\rho \le \tfrac32\pi - \delta$, $0 < \delta < \tfrac12\pi$.²

The coefficients $\phi^2(s)$ and $\chi(\alpha, \beta, s)$ of the differential equation (6) are analytic,
non-vanishing functions of s in the domain $R_s$, and $\chi(\alpha, \beta, s)$ is uniformly
bounded with respect to the parameters $\alpha$ and $\beta$.

The solutions of a differential equation of the type (6) have been derived by
Langer [6] for large values of $|\rho|$ and for values of the independent variable
in a domain of the type $R_s$. It follows that the equation (1) admits of known
solutions for large values of $|\nu|$ when z remains within the domain $R_z$.

It is convenient to introduce the complex quantities

$$\Phi = 2^{-1/2}\int_0^s \big(1 + \tfrac18 s^2\big)^{-1/2}\,ds = \log\big[z + (z^2 - 1)^{1/2}\big],\qquad \xi = \rho\,\Phi, \tag{8}$$

in which the logarithm has its principal value. The domain $R_z$ is mapped
bi-uniquely upon the domain $R_\Phi$ consisting of the half-strip of the $\Phi$-plane specified
by the relations

$$-\pi < I(\Phi) \le \pi,\qquad -\tfrac12\pi < \arg\Phi \le \tfrac12\pi,$$

from which some small neighborhoods of the points $\Phi = \pm\pi i$ have been removed
(it will be assumed that these neighborhoods are of a shape to make $R_\Phi$ convex).
The function $\Phi(z)$ is analytic over $R_z$ and non-vanishing except at the point
$z = 1$.

² Henceforth the symbol $\delta$ will always denote such a number; usually it will be taken
small.
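The two expressions for $\Phi$ can be compared numerically. The sketch below is only a check under an assumption: it takes the change of variable to be $z = 1 + \tfrac14 s^2$, so that $\Phi = 2^{-1/2}\int_0^s (1+\tfrac18 t^2)^{-1/2}\,dt$, and it treats real $s \ge 0$ only. The function names are hypothetical.

```python
import math


def phi_by_quadrature(s, n=2000):
    """Composite Simpson rule for 2**-0.5 * integral_0^s (1 + t*t/8)**-0.5 dt (n even)."""
    f = lambda t: (1.0 + t * t / 8.0) ** -0.5
    h = s / n
    acc = f(0.0) + f(s)
    for k in range(1, n):
        acc += (4 if k % 2 else 2) * f(k * h)
    return acc * h / 3.0 / math.sqrt(2.0)


def phi_closed_form(s):
    """log[z + sqrt(z^2 - 1)] with z = 1 + s^2/4 (assumed normalization)."""
    z = 1.0 + 0.25 * s * s
    return math.log(z + math.sqrt(z * z - 1.0))


if __name__ == "__main__":
    for s in (0.5, 2.0, 7.0):
        print(s, phi_by_quadrature(s), phi_closed_form(s))
```

Agreement of the two columns supports the reading $\cosh\Phi = z$; for complex s a contour quadrature would be needed instead.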

In the sequel use will be made of the solutions derived in [6] for the type 
equation (6). It must therefore be shown that a hypothesis imposed by Langer 
upon the equation is fulfilled in the present instance. It will be required that* 


be satisfied uniformly by some constant M along any straight path in Re upon 
which |s| 2 N,; > 0. For the case in hand it is found that 


es. af 1 -$ B — a 
-@- aren 8} erie 


O(s)ds _ ds db 
way = (3) +0(): 


and the fulfillment of the condition is immediate. In fact, it is satisfied uni- 
formly for all values of a and 6 in any preassigned bounded domain of those 
quantities. 


It follows that 


3. Asymptotic forms for the Jacobi functions. In [6] separate forms are
given for the solutions of (6) according as the variable z is inside or outside a
circle of radius $c/|\nu|$ (c denotes some positive constant) about the point $z = 1$.
This division of the values of z will be designated briefly by the relations $|\xi| \le N$
and $|\xi| > N$, respectively.

The function $Y_{\nu,1}^{(\alpha,\beta)}(z)$ given in (2) is easily identified in terms of a solution
$y_1(z)$ of the equation (1) derived in [6], Theorem 1. The function $y_1(z)$ is
uniquely characterized as that solution of the differential equation which, when
$R(\alpha) > 0$, approaches a constant as z approaches unity, to a higher order than
any other solution. This property is evidently possessed by the Jacobi function
$Y_{\nu,1}^{(\alpha,\beta)}(z)$. It follows that $Y_{\nu,1}^{(\alpha,\beta)}(z)$ and $y_1(z)$ differ only by a constant factor.
Simple computations involving the relation


lim (2 — 1) *¢ = p 
z—l 
and the form given in [6] for the solution $y_1(z)$ when $|\xi| \le N$ lead to the formula


) VP @ = 2" PPa + lily + Ha +6 + DIOP ne. 


The second Jacobi function $Y_{\nu,2}^{(\alpha,\beta)}(z)$ given in (2) is uniquely characterized
as that solution of (1) for which

$$\lim_{|z|\to\infty} z^{\nu+\alpha+\beta+1}\,Y_{\nu,2}^{(\alpha,\beta)}(z) = 2^{\nu+\alpha+\beta+1}. \tag{10}$$


*See [6], pp. 400, 405 for the definition of the quantity 0(s). 










When $|z| \to \infty$, the variable $\xi$ is in the region $|\xi| > N$. For such values of $\xi$
the solutions given in [6] change as that variable crosses the boundaries of
regions $\Xi^{(h)}$ specified by the inequalities

$$\Xi^{(h)}:\quad (h-1)\pi + \delta \le \arg\xi \le (h+1)\pi - \delta. \tag{11}$$

By the equations (8) it is seen that

$$\arg\xi = \tfrac12\pi + \arg\big[\nu + \tfrac12(\alpha+\beta+1)\big] + \arg\Phi \tag{12}$$

and, moreover, when $|z| \to \infty$, $\arg\Phi \to 0$. It is thus evident that $\xi|_{|z|\to\infty}$ is
in the region $\Xi^{(1)}$ when $-\tfrac12\pi + \delta \le \arg[\nu + \tfrac12(\alpha+\beta+1)] \le \pi - \delta$, and in the
region $\Xi^{(0)}$ when $-\pi - \delta < \arg[\nu + \tfrac12(\alpha+\beta+1)] \le \tfrac12\pi - \delta$. In [6], Theorem 2,
fundamental pairs of solutions $y_{h,j}(z)$ (j = 1, 2), for any integer h, are
deduced. For values of $\xi$ in the region $\Xi^{(h)}$ these solutions are uniquely
characterized by the relations
( lim 2’tetPtty, (2) = 2 tO 1 + O()}, 


|z|-* 


lim z’yno(z) = 27°F 11 + OW )}. 


|z|-*e 


It is immediate that for large enough values of $|\nu|$, the Jacobi function $Y_{\nu,2}^{(\alpha,\beta)}(z)$ is, to
within a constant factor, the solution $y_{h,1}(z)$ with h = 0 or 1 according as $\xi|_{|z|\to\infty}$
is in $\Xi^{(0)}$ or $\Xi^{(1)}$, respectively. However, $y_{0,1}(z) = y_{1,1}(z)$ by definition ([6], (11)).
It follows that


(13) rl, 6° (2) = gr titet+t .. .(s) (1 + Oly yy, 
The substitution of the general forms given in [6] for the solutions $y_1(z)$ and
$y_{0,1}(z)$ into (9) and (13) respectively yields asymptotic forms for the two functions
under consideration which are valid

(A) (i) in the entire complex plane of z from which some small neighborhood of
the point $z = -1$ has been excluded and which has been cut by the relations
$-\pi < \arg(z + 1) \le \pi$;
(ii) for values of $\nu$ limited by the cut $-\pi - \delta < \arg[\nu + \tfrac12(\alpha+\beta+1)] \le \pi - \delta$;
(iii) uniformly for all $\alpha$ and $\beta$ in any preassigned bounded domain.

The facts are incorporated in the theorem which follows.

THEOREM I. Under the conditions (A), the Jacobi functions $Y_{\nu,1}^{(\alpha,\beta)}(z)$ and
$Y_{\nu,2}^{(\alpha,\beta)}(z)$ admit of the representations, asymptotic as to $\nu$, given by the formulas
TI (@) = BP Pla + Iiily + Ha + B+ 1) 
(z _ 1) eH (z + 1) 73° ot 7.) + Ril, 
r(e, $s ) gi gr titat iti Matt "ly + Ma + 3 4 1)}' 
(2 — 1g + IOP SHAD © + hal 


(14) 




in which the correction terms $R_i$ (i = 1, 2) have the forms
(e***? O(y*) for |z-—1| Se/|»|, 
1 = le t coe. 
Ne "ow e Ow )} for |z—1|>c/|r], 


? 


and 
(¢ "Ov ') if a0, for |z—1| Sc/|»|, 
Re = (log 900" if a=0, for |z—1| Sc/|r|, 
(Ri for |z-—1]>c/|»|. 


The symbols $J_\alpha(\xi)$ and $H_\alpha^{(1)}(\xi)$ denote the Bessel functions of the first and third
kinds respectively. The symbol c denotes some fixed positive constant; the variables
$\xi$ and $\Phi$ are as given explicitly in (8).

For values of z such that $|z - 1| > c/|\nu|$ the formulas (14) may be replaced
by the alternative forms given in terms of elementary functions, [6]. These
forms depend upon the region $\Xi^{(h)}$ of the variable $\xi$. Recalling (12) and the
restrictions upon $\rho$ and $\Phi$, we see that the total range of $\arg\xi$ is given by
$-\pi - \delta < \arg\xi \le 2\pi - \delta$. The regions $\Xi^{(h)}$ with indices h = −1, 0, 1 cover
this range entirely. It is found that when $|z - 1| > c/|\nu|$ the Jacobi functions
admit of the forms


o (2) = IPO Pa + Dy + Hae + 6+ DOME — 1) 
(15) (2 + 1) 79 faith e® + afl? ey, 
ly’ r(a, = °B) (2) = Daal }(a+B+1) (z 1) }(a+}) (z + 1)" 4(B+}) fa i eit + as) e**} , 


with coefficients $a_{i,j}^{(h)}$ dependent upon the region $\Xi^{(h)}$ of the variable $\xi$ as shown in
Table I.⁴


TABLE | 
h ay) ay, ay") ai") 
1 | [estbeq (1 Tie 0 
0 | fetty ie eS Sse 
=e [| eet) | eee 


The representations (14) and (15) above are analytic functions of the variable z.

The formulas (15) with the coefficients $a_{i,j}^{(h)}$ as stated in Table I for the
regions $\Xi^{(0)}$ and $\Xi^{(1)}$ of the variable $\xi$ were given by Watson [8]. He applied
the well known method of steepest descents to two hypergeometric functions


‘Henceforth the symbol [EZ] will always denote a function of the form [E] = E + 


Ot) O(E”). 








which are equivalent to the functions (2). Simple algebraic manipulations
suffice to reduce his formulas to the above indicated cases of (15). His statements
of the regions of validity for the results are the same as those to be
found in the special cases (b) and (c) of §4 here.


4. The regions of validity. The asymptotic forms (15) change as the variable
$\xi$ crosses the boundaries of the regions $\Xi^{(h)}$. It is to be noted that any two
regions for consecutive values of h overlap in the major part of an upper or
lower half-plane. In this common domain the apparently different forms are
asymptotically equivalent. It follows that the actual line across which any
two forms interchange may be placed arbitrarily within the common half-plane
of their validity.

While the relations (11) describe the regions $\Xi^{(h)}$ in all their generality, it is
desirable to rephrase their content in terms of the variations of z or $\nu$ when
one of these quantities is fixed.
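The bookkeeping of regions can be sketched in a few lines. The following is a minimal illustration, assuming the regions are the sectors $(h-1)\pi + \delta \le \arg\xi \le (h+1)\pi - \delta$ and that $\arg\xi = \tfrac12\pi + \arg\nu + \arg\Phi$, with $\arg\nu$ standing in for $\arg[\nu + \tfrac12(\alpha+\beta+1)]$ (legitimate only when $|\nu|$ is large). The function name is hypothetical.

```python
import math


def regions_containing(arg_nu, arg_phi, delta, indices=(-1, 0, 1)):
    """Indices h whose sector contains arg(xi) = pi/2 + arg(nu) + arg(Phi)."""
    arg_xi = 0.5 * math.pi + arg_nu + arg_phi
    return [h for h in indices
            if (h - 1) * math.pi + delta <= arg_xi <= (h + 1) * math.pi - delta]


if __name__ == "__main__":
    # |z| large (arg Phi -> 0), nu positive real: both h = 0 and h = 1 apply.
    print(regions_containing(0.0, 0.0, 0.1))
    # nu on the positive imaginary axis: only h = 1.
    print(regions_containing(math.pi / 2, 0.0, 0.1))
    # nu on the negative imaginary axis: only h = 0.
    print(regions_containing(-math.pi / 2, 0.0, 0.1))
```

Where two indices are returned, either asymptotic form may be used, in keeping with the remark that the forms are equivalent on the overlap.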

When $\arg\Phi$ is fixed, the boundaries of the regions $\Xi^{(h)}$ (as seen in the plane
of $\nu$) are radial lines issuing from the point $\nu = -\tfrac12(\alpha + \beta + 1)$ which divide
the $\nu$-plane into sectors. Of particular importance are the following special
cases:

(a) When $|z|$ is large, the entire right half-plane of $\nu$ is contained within
$\Xi^{(0)}$ and $\Xi^{(1)}$ simultaneously while the upper and the lower half-planes are
contained within $\Xi^{(1)}$ and $\Xi^{(0)}$, respectively.

(b) For any value of z such that $I(z) \ge 0$, the admissible values of $\nu$ are
distributed between $\Xi^{(1)}$ and $\Xi^{(0)}$ as follows:

$$\Xi^{(1)}:\ -\tfrac12\pi + \delta \le \arg\nu \le \pi - \delta,\qquad \Xi^{(0)}:\ -\pi - \delta < \arg\nu \le -\delta.$$

(c) For any value of z such that $I(z) < 0$ the important range
$|\arg\{\nu + \tfrac12(\alpha+\beta+1)\}| \le \pi - \delta$ is portioned between $\Xi^{(1)}$ and $\Xi^{(0)}$ by the
relations⁵

$$\Xi^{(1)}:\ \delta \le \arg\nu \le \pi - \delta,\qquad \Xi^{(0)}:\ -\pi + \delta \le \arg\nu \le \tfrac12\pi - \delta.$$


When $\arg\nu$ is fixed, the boundaries of the regions $\Xi^{(h)}$ as seen in $R_z$ are the
curvilinear arcs represented by $\arg\Phi$ = constant. In general these curves
will leave the point $z = 1$ at the angle $2\arg\Phi$ with the positive axis of reals
and bend round in such a way as to cut the negative axis of reals in a point
whose modulus is greater than unity. The arcs resemble logarithmic spirals.
In general the entire domain $R_z$ will be contained within two of the regions
$\Xi^{(h)}$ (h = −1, 0, 1). The following cases are of especial importance.

(d) When $\nu$ is a positive real number, the entire domain $R_z$ with the exception
of a small convex region enclosing the real interval $-1 \le z \le 1$ is contained
within both $\Xi^{(0)}$ and $\Xi^{(1)}$. The upper and lower half-planes are contained
within $\Xi^{(1)}$ and $\Xi^{(0)}$, respectively.

(e) When $0 \le \arg\nu \le \pi - \delta$, $\xi$ may be taken in $\Xi^{(1)}$, and when
$-\pi + \delta \le \arg\nu < 0$, $\xi$ may be taken in $\Xi^{(0)}$.

⁵ Since $|\nu|$ is very large relative to $|\alpha|$ and $|\beta|$, one has $\arg(\nu + \tfrac12(\alpha+\beta+1)) \sim \arg\nu$.


5. The Jacobi polynomials. The classical Jacobi polynomials are defined by
the formula

$$P_n^{(\alpha,\beta)}(z) = \frac{\Gamma(n + \alpha + 1)}{\Gamma(n + 1)\,\Gamma(\alpha + 1)}\,F\big(n + \alpha + \beta + 1,\ -n;\ \alpha + 1;\ \tfrac12(1 - z)\big)$$

in which n is a positive integer [5]. Making use of the asymptotic formula
for the gamma function and the note (d) of the previous section, one has


COROLLARY I. Asymptotic forms for the Jacobi polynomials are obtained from
the first formulas in (14) and (15) and the relation

$$P_n^{(\alpha,\beta)}(z) \sim \frac{n^{\alpha}}{\Gamma(\alpha + 1)}\,Y_{n,1}^{(\alpha,\beta)}(z).$$

When (15) is used, the variable $\xi$ is confined to the regions $\Xi^{(h)}$ (h = 0, 1).


Imposing the restrictions $\alpha$ and $\beta$ real and positive, $-1 < z < 1$ with
$z = \cos\theta$ upon the forms obtained from the corollary, one obtains the classical
result of Darboux [2] and the more recent result of Szegö [7].
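The classical Darboux form referred to here can be compared with exact values. The sketch below is an illustration only: it evaluates the terminating hypergeometric series in exact rational arithmetic (to avoid cancellation) and compares it with the standard leading term $n^{-1/2}\,\pi^{-1/2}\sin^{-\alpha-1/2}(\tfrac12\theta)\cos^{-\beta-1/2}(\tfrac12\theta)\cos(N\theta + \gamma)$, with $N = n + \tfrac12(\alpha+\beta+1)$ and $\gamma = -\tfrac12\pi(\alpha + \tfrac12)$. That leading term is quoted from the standard literature rather than from the formulas of this paper, and the function names are hypothetical.

```python
from fractions import Fraction
import math


def jacobi_exact(n, alpha, beta, z):
    """P_n^{(alpha,beta)}(z) from the terminating series, using exact rationals."""
    alpha, beta, z = Fraction(alpha), Fraction(beta), Fraction(z)
    x = (1 - z) / 2
    term = Fraction(1)
    total = Fraction(1)
    for k in range(n):
        # Ratio of consecutive terms of F(n+alpha+beta+1, -n; alpha+1; x).
        term *= Fraction(k - n) * (n + alpha + beta + 1 + k) * x
        term /= (alpha + 1 + k) * (k + 1)
        total += term
    pref = Fraction(1)
    for k in range(1, n + 1):
        pref *= (alpha + k) / k  # builds Gamma(n+alpha+1) / (n! Gamma(alpha+1))
    return float(pref * total)


def darboux_leading(n, alpha, beta, theta):
    """Leading term of the Darboux asymptotic form for 0 < theta < pi."""
    big_n = n + 0.5 * (alpha + beta + 1)
    gamma = -0.5 * math.pi * (alpha + 0.5)
    k = (math.sin(theta / 2) ** (-alpha - 0.5)
         * math.cos(theta / 2) ** (-beta - 0.5) / math.sqrt(math.pi))
    return k * math.cos(big_n * theta + gamma) / math.sqrt(n)


if __name__ == "__main__":
    n = 100
    exact = jacobi_exact(n, 0, 0, Fraction(1, 2))      # Legendre case, z = cos(pi/3)
    approx = darboux_leading(n, 0, 0, math.acos(0.5))
    print(exact, approx, abs(exact - approx))
```

The discrepancy shrinks as n grows, and it grows as $\theta$ approaches 0 or $\pi$, which is where the Bessel-function forms are needed.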


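II. The generalized Legendre functions

6. Asymptotic forms for $P_\nu^\mu(z)$ and $Q_\nu^\mu(z)$ when $|\nu|$ is large. The formulas
(3) constitute the most general definitions for the Legendre functions. For
the special case in which z is confined to the real interval $-1 \le z \le 1$ it is
customary to set $z = \cos\theta$ ($0 \le \theta \le \pi$) and to redefine the functions by means
of the formulas

$$\begin{aligned}
P_\nu^\mu(\cos\theta) &= e^{\frac12\mu\pi i}\,P_\nu^\mu(\cos\theta + 0\cdot i),\\
2e^{\mu\pi i}\,Q_\nu^\mu(\cos\theta) &= e^{-\frac12\mu\pi i}\,Q_\nu^\mu(\cos\theta + 0\cdot i) + e^{\frac12\mu\pi i}\,Q_\nu^\mu(\cos\theta - 0\cdot i).
\end{aligned}\tag{16}$$

For general values of z it is known that

$$\begin{aligned}
P_\nu^\mu(-z) &= e^{\mp\nu\pi i}\,P_\nu^\mu(z) - \frac{2}{\pi}\sin(\nu + \mu)\pi\; e^{-\mu\pi i}\,Q_\nu^\mu(z),\\
Q_\nu^\mu(-z) &= -e^{\pm\nu\pi i}\,Q_\nu^\mu(z),
\end{aligned}\tag{17}$$

in which the upper or the lower signs are to be taken according as $I(z) \gtrless 0$.
When $z = \cos\theta$, these relations are replaced by

$$\begin{aligned}
P_\nu^\mu(-\cos\theta) &= P_\nu^\mu(\cos\theta)\cos(\nu + \mu)\pi - \frac{2}{\pi}\,Q_\nu^\mu(\cos\theta)\sin(\nu + \mu)\pi,\\
Q_\nu^\mu(-\cos\theta) &= -Q_\nu^\mu(\cos\theta)\cos(\nu + \mu)\pi - \frac{\pi}{2}\,P_\nu^\mu(\cos\theta)\sin(\nu + \mu)\pi.
\end{aligned}\tag{17'}$$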








These four identities allow the limitation of the variable z to its right half-plane.
However, the restriction will not be imposed except where greater
simplicity results.

From the formulas (3) and (14) one obtains the representations


P 1\#} 
(pace) = er UTE iy + YO} + Ril, 


(27—1)# 
{ ute too Te + IPO +u4+1) © +3)! 
1 \Or(z) = Mut bo? oo YF ae Se \ 2 
us) axis < P(2» + 2) (—1)! 
| {HO {i + 4)®} + Ri, 
in which the correction terms $R_i$ (i = 1, 2) are as given in Theorem I with the
substitution $\alpha = -\mu$ and the variables $\xi$ and $\Phi$ are explicitly

$$\xi = i(\nu + \tfrac12)\Phi,\qquad \Phi = \log\{z + (z^2 - 1)^{1/2}\}.$$

Similarly, when $|\xi| > N$, the formulas (3) and (15) lead to the representations


[r 
(19) v v h 3 (h) i 
[Qte) = 2h Oe ay Wet + Re 


(hk) - ty 
, 


(20) 4 + a) 42? — 1) fdr e* + dive 


’ 


in which the coefficients $b_{i,j}^{(h)}$ are dependent upon the region $\Xi^{(h)}$ of the variable $\xi$
and are obtained from the $a_{i,j}^{(h)}$ given in Table I by the substitution $\alpha = -\mu$.


THEOREM II. The representations, asymptotic as to $\nu$, for the generalized
Legendre functions $P_\nu^\mu(z)$ and $Q_\nu^\mu(z)$, valid when $-\pi < \arg(z + 1) \le \pi$ and
$-\pi - \delta < \arg\nu \le \pi - \delta$, are given in general by the formulas (18). For values
of z such that $|z - 1| > c/|\nu|$, the forms (19) hold. The forms (18) and (19)
are valid uniformly for all values of $\mu$ in any bounded domain.


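Important specializations of the forms (18) are obtained upon applying the
definitions (16). The notation $(\cos\theta \pm 0\cdot i)$ of (16) implies the limiting values:
$\Phi = \pm i\theta$, $(\cos^2\theta - 1)^{1/2} = e^{\pm\frac12\pi i}\sin\theta$, in which the upper or the lower
signs are to be taken together. Easy calculations, if the identities
$J_{-\mu}(se^{\pi i}) = e^{-\mu\pi i}J_{-\mu}(s)$, $2iY_{-\mu}(s) = H_{-\mu}^{(1)}(s) - H_{-\mu}^{(2)}(s)$ between Bessel functions
are employed, lead to the representations

$$\begin{aligned}
P_\nu^\mu(\cos\theta) &= (\nu + \tfrac12)^{\mu}\Big(\frac{\theta}{\sin\theta}\Big)^{1/2}\big\{J_{-\mu}[(\nu + \tfrac12)\theta] + R_1\big\},\\
Q_\nu^\mu(\cos\theta) &= -\frac{\pi}{2}\,(\nu + \tfrac12)^{\mu}\Big(\frac{\theta}{\sin\theta}\Big)^{1/2}\big\{Y_{-\mu}[(\nu + \tfrac12)\theta] + R_2\big\},
\end{aligned}\tag{20}$$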


in which the correction terms $R_i$ (i = 1, 2) have the forms


e*O(1) for 0 <6 c/|r|, 
Ri =) €° Ov?) +e” Ov") for @ > c/|v| and for general complex values of », 
| Ow) for 6 > c/|v| and real values of »; 
(@-O(v*”") if » #Oand0 <0 <c/|r|, 


= {(log 0)OW") if » =Oand0 <¢@ 


IIA 
° 


[Ri otherwise. 


The forms (20) are valid uniformly in $\mu$ for values in any bounded domain and
for $\theta$ in the indicated portions of the range $0 < \theta \le \pi - \delta$.

A similar treatment of the formulas (19) leads to the classical asymptotic
formulas for the functions under consideration.


7. Asymptotic forms for $P_\nu^\mu(z)$ and $Q_\nu^\mu(z)$ when $|\mu|$ is large. Utilizing the
transformation

$$(z^2 - 1)(t^2 - 1) = 1,$$

Whipple [9] has obtained relations between the Legendre functions of degree $\nu$
and order $\mu$ and those of degree $-\mu - \tfrac12$ and order $-\nu - \tfrac12$ which hold when
$R(z) \ge 0$. Written in forms convenient for the purposes of the present deductions,
these relations are


—}(v+4) 
P>*(z) = (Qn)? Pu + 2) @ — 1) ( ** yeas ©, 


ws) | T(2u + 1) t—1 
{ < 
ri l t PO tetas 
Qi(z) = (4n)he" ees de - 1)* ‘G+ nf 7 yicty >), 
3 


(v+}, 


in which the symbols $Y_{-\mu-\frac12,\,j}^{(\nu+\frac12,\,-\nu-\frac12)}(t)$ are specializations of the Jacobi functions (2).
The relations (21) together with the well known identity

$$P_\nu^\mu(z) = \frac{\Gamma(\nu + \mu + 1)}{\Gamma(\nu - \mu + 1)}\Big\{P_\nu^{-\mu}(z) + \frac{2}{\pi}\,e^{\mu\pi i}\sin\mu\pi\;Q_\nu^{-\mu}(z)\Big\} \tag{22}$$

will furnish complete asymptotic representations for the Legendre functions
$P_\nu^\mu(z)$ and $Q_\nu^\mu(z)$ when $|\mu|$ is large and $|\nu|$ is bounded.

In substituting the asymptotic forms for the Jacobi functions into the relations
(21) the variables $\Phi$ and $\xi$ will be replaced by the quantities

$$\bar\Phi = \log\Big(\frac{z + 1}{z - 1}\Big)^{1/2},\qquad \bar\xi = i\mu\bar\Phi; \tag{23}$$

in the first of these formulas the root is to be chosen as that which has a real
positive value when z is real and greater than unity, and the logarithm is to
have its principal value.











Substituting (14) into (21), one obtains the representations 


7 ee ? 
| P>*(z) _ 2 b (etd idet B (u + D) BH (ind) + Ri}, 


(24) { P(2u + 1) 
| Q(z) = (5) rent fet 1) {J ,44 (ip) + Ro}, 
in which the correction terms R; (¢ = 1, 2) have the structural forms 
(@°'Om’") if » +440, for |z| Zelul, 
R, = <((log u®)O(') if » +34 =0, for |z| 2elu 
[Se*Ou) +e" OW} for |z| <elul, 
and 


R, =; ihe . 
OR, for j|2| <ejp}. 


The symbol c denotes some positive constant and the quantity $\bar\Phi$ is given
explicitly in (23).

If the identity (22) is used, it is convenient to restrict the parameter $\mu$
by the relation $|\arg\mu| \le \pi - \delta < \pi$ in order that use may be made of the
asymptotic formula for the gamma function. In virtue of the results obtained
for the function $P_\nu^{-\mu}(z)$ such a restriction involves no loss of generality. One
obtains the representation

Pe(z) = €*u B (sin vr)e PY ASP (i) 


(25) . v+4) hei (2) 7° = 

+ (sin pre "Heyy (iuh) + Bi 
in which the correction term R; has the forms 

‘sin (u — v)e O(u*) + sinur O(u”*) if v+4+0, for |z| 2clu), 


R, <3 sin (u — v)rlog [1 + O(u ]O(u") + sin ur O(n) 
4 if v+}=0, for |z| 2 cla} 





ls 1O(u~?) {sin (u — v)we ** + sin wx et?) for |z| <cjp. 


A similar treatment involving the use of the asymptotic forms (15) in the
relations (21) leads to representations in terms of elementary functions which
are valid when $|z| < c|\mu|$. The results are


( ™ 2° T(u + 3) ( (hy) [2 — + (h) [2 +1 i) 
>—# no } »(h) \ 
| P, (z) - a (Qu 1) oon 2+1 + Ci,2 aot f? 


(26) { f 1 hu +1] ju) 
| Qr(e) = Be wT + +1) yen (: ) + cr (: — ') f 


in which the coefficients $c_{i,j}^{(h)}$ are as shown in Table II.










TABLE II 
h ef) cy? coy C32 
ad] oa 0 [-e"" he 
> wl. a a 
ee a ee ee ee ee 


When the forms (26) are substituted into the identity (22), it should be noted
that the relation $|\arg\mu| \le \pi - \delta < \pi$ allows the restriction of the variable
$\bar\xi$ to the pair of regions $\bar\Xi^{(h)}$, for h = 0, 1. Easy computations lead to the
formula


? , = 2 j - y i + 7 7 “ _ (: am  } 
(27) P*(z) = (2) « e {tin un (? ae [(sin var )e ] ak i 


in which the upper or the lower sign is to be taken according as the variable $\bar\xi$
is in the region $\bar\Xi^{(1)}$ or $\bar\Xi^{(0)}$.


THeorEM III. The representations, asymptotic as to yu, for the generalized 
Legendre functions P>“(z) and Q(z) are given for general values of z by (24) and 
(25). For values of z such that |z| < c¢| uw | the formulas (26) and (27) apply. 

The forms are uniform with respect to v for v in any fixed bounded domain. The 
variable z ranges over the domain specified by R(z) 2 0 and | arg (2 +1)| S37 
The range of arg u for (24) and (26) is given by —x — 6 < argu S mr — band 
for (25) and (27) by |argu| Sw —6 <z. 


Correct forms for $P_\nu^\mu(\cos\theta)$ and $Q_\nu^\mu(\cos\theta)$, asymptotic as to $\mu$, apparently
have never been given. Barnes [1] published erroneous results. It is found
from (27) and the second formula of (26) that the forms


j 
P*(cos 0) = (?) e “nu? {[sin ux](cot $6)" — [sin vx](tan’36)"}, 
us 


j 
QF (cos 6) = (5) e “un * {[cos ux](cot $0) — [cos vx](tan $0)*} 
are valid when $|\mu|$ is large in comparison with $|\nu|$, $|\arg\mu| \le \pi - \delta$, and
$0 < \theta \le \tfrac12\pi$.

A few remarks concerning the regions $\bar\Xi^{(h)}$ of validity for the forms (26) and
(27) may be of assistance to the reader. The regions are given by (11) with $\bar\xi$
replacing $\xi$. The domain $R_{\bar\Phi}$ is described by the inequalities:
$-\tfrac12\pi < I(\bar\Phi) \le \tfrac12\pi$, $-\tfrac12\pi \le \arg\bar\Phi < \tfrac12\pi$. Since
$\arg\bar\xi = \tfrac12\pi + \arg\mu + \arg\bar\Phi$, it is seen that,
for a fixed value of $\arg\mu$, any boundary line of a region $\bar\Xi^{(h)}$ which appears
explicitly in $R_{\bar\Phi}$ will be a radial line extending from the origin at some angle
of inclination a, $-\tfrac12\pi < a < \tfrac12\pi$, with the axis of reals. The corresponding
curve in the z-plane will depart from a point on the real axis between 0 and 1
and will have the inclination −a. As $|z|$ increases, the boundary line in
question bends toward and finally becomes asymptotic to the line $\arg(z - 1) = -a$.
With these few facts, the reader may easily translate the discussion of
§4 to suit the case in hand.


~ 


BIBLIOGRAPHY

1. E. W. BARNES, On the generalized Legendre functions, Quar. Jour. of Math., vol. 39(1908), p. 97.
2. G. DARBOUX, Sur l'approximation des fonctions, etc., Liouv. Jour., (3), vol. 4(1878), p. 377.
3. E. W. HOBSON, On a general type of spherical harmonic, etc., Phil. Trans. Royal Soc. of London, vol. 97(1896).
4. E. W. HOBSON, Spherical and Ellipsoidal Harmonics, Cambridge, England, 1931.
5. C. G. J. JACOBI, Untersuchungen über die Differentialgleichung der hypergeometrischen Reihe, Werke, vol. 6, p. 184, or Jour. für Math., vol. 56(1859), p. 149.
6. R. E. LANGER, On the asymptotic solutions of ordinary differential equations, etc., Trans. Amer. Math. Soc., vol. 37(1935), p. 397.
7. G. SZEGÖ, Asymptotische Entwicklungen der Jacobischen Polynome, Halle, 1933.
8. G. N. WATSON, Asymptotic expansions of hypergeometric functions, Trans. Cambridge Phil. Soc., vol. 22.
9. F. J. W. WHIPPLE, A symmetric relation between Legendre's functions, Proc. London Math. Soc., (2), vol. 16(1917).


OHIO STATE UNIVERSITY











PRESERVATION OF PARTIAL LIMITS IN MULTIPLE SEQUENCE 
TRANSFORMATIONS 


By HUGH J. HAMILTON


1. Introduction. Problems (iii) and (iv) in §1.4 of’ H, were solved in that 
paper only for s,2 = 0. It is the purpose of the present paper to give complete 
solutions. Existence of the transform o,» is assumed for each m. 


2. Additional notations and definitions. Let X and Y denote classes of 
sequences. The notation X — Y row reg shall signify row regularity of the 
transformation X — Y in case each of X and Y is the class of regularly con- 
vergent sequences, and ultimate row regularity in all other cases in question. 
(See §1.4 of H,.) Thus’ NS RC — RC row reg will mean NS RC — RC with 
o = s2 for all k*. The notation NS RC — RC ul row reg shall mean NS 
RC > RC with ox: = s,2 for all k sufficiently large. 

Consider the matrix $\|b_{mk}\|$, where $b_{mk} = a_{mk}$ ($k \ne m$), and $b_{mm} = a_{mm} - 1$.
Define the sequence $\{\tau_m\}$ by the equations

$$\tau_m = \sum_{k=1}^{\infty} b_{mk}\,s_k = \sigma_m - s_m$$

and let NS X −* → Y denote NS $\{\tau_m\}$ be of class Y whenever $\{s_m\}$ is of class X.


3. List of theorems (first form). The following theorems are obvious. 

NS RC, NS BURC, NS URC → URC row reg are, respectively, NS RC, NS
BURC, NS URC −* → URCRN.

NS RC, NS BURC → BURC row reg are, respectively, NS RC, NS BURC −* → BURCRN.

NS URC → BURC row reg are NS URC → B and NS URC −* → URCRN.

NS RC → RC ul row reg are NS RC −* → RCURN.

NS BURC, NS URC → RC row reg are, respectively, NS BURC, NS URC
→ RC, and, respectively, NS BURC, NS URC −* → URCRN.

NS RC → RC row reg are NS RC −* → RCRN.


Received October 28, 1938.

¹ H₁ will denote the author's paper, Transformations of multiple sequences, this Journal,
vol. 2(1936), pp. 29-60. The present paper assumes familiarity with the contents of H₁,
the ideas, terminology, notations, and results of which are freely used without further
comment.

² NS shall abbreviate "conditions necessary and sufficient that."


4. List of theorems (second form). Denote each condition listed in §3 of H₁,
as applied to the matrix $\|b_{mk}\|$, by the label there attached to it, with a prime.
(Thus $(a_1)'$, $(a_2)'$, etc.) It is then possible to express the theorems of §3 in a
fashion typified by the following two particular examples.

NS RC → URC row reg are $(a_1)'$, $(b_1)'$, $(d_1)'$, $(d_2)'$, $(d_3)'$, $(\mathfrak{e}_1)'$, $(\mathfrak{e}_2)'$, $(\mathfrak{e}_3)'$.

NS URC → BURC row reg are $(c_1)$, $(c_2)$, $(a_1)'$, $(a_2)'$, $(b_1)'$, $(b_2)'$, $(d_1)'$, $(d_3)'$,
$(\mathfrak{e}_1)'$, $(\mathfrak{e}_2)'$, $(\mathfrak{e}_3)'$.


5. Reductions of conditions. These sets of conditions may be more elegantly
expressed by reducing the conditions $(a_1)'$, $(a_2)'$, etc., to simpler conditions on
the matrix $\|a_{mk}\|$. For the most part the latter occur among those listed in
§3 of H₁. However, the following are also needed.


Lintks = 0 for k® # m’', kb’ ¥ m": 


ve} fen), with Linus = 1 fork® = m' ork’ = m". 
{és} (es), With Lai = 1. 

“ ; Links = Ofork® # m'. k* ¥ m": 
a i Laws = 1 fork’ = m' ork’ = m". 
{f3} (fs), with Las = 1. 


In the list of equivalences, which follows, note that the conditions in the 
first paragraph involve the mere existence of limits, bounds, ete. 

(a1)’, (a2)’, (bi)’, (be)’, (cx)’, (di)’, (de)’, (ds)’, (da)’, Ces)’, (&)’, (f1)’, (f2)’, (hy, 
(f,;)’ are equivalent, respectively, to (a:), (a2), (bi), (be), (ex), (di), (de), (ds), 
(d,), (ef), (@:), (f1), (fe), (fs), (f). 

$(\mathfrak{e}_2)'$, $(\mathfrak{e}_3)'$, $(f_2)'$, $(f_3)'$ are equivalent, respectively, to $\{\mathfrak{e}_2\}$, $\{\mathfrak{e}_3\}$, $\{f_2\}$, $\{f_3\}$.

The equivalences in the first paragraph are obvious. Proof will now be given 
for the first equivalence in the second paragraph. 

The condition 


lim ba bus = 0 form' > E (ke = 1,2,-" 
mi=o ki=l 

becomes 
lim >> anc =0 form'>E (k* = 1,2, -*°), 


mia ktm] 


excepting cases in which the positions of the elements of $k^2$ occur among those
of $m^1$ and corresponding values coincide, and in these cases becomes


io) 
. 1 "Al 
lim p 2 Qnk = 1 form > E. 


mince ki=l 


6. List of theorems (final form). Finally, in view of (.08)-(.11) of §4 in H₁,
the sought results may be tabulated as follows.

1. NS RC — URC row reg are (a1), (bi), (di), (de), (ds), (&:), {2}, {8s}. 
2. NS BURC — URC row reg are (a1), (b:), (ds), (da), (es), (@1), {82}, {&s}. 


3. NS URC — URC row reg are (a1), (az), (bi), (be), (di), (ds), (&1), {82}, {8s}. 


5. NS BURC — BURC row reg are (1), (ds), (da), (e¢), (#1), {82}, {8s}. 

6. NS URC — BURC row reg are (cx), (¢2), (di), (ds), (@:), {2}, {&s}. 

7. NS RC — RC ul row reg are (¢:), (di), (de), (ds), (@:), {G2}, {@s}, (fi), (fe), (fs). 
8. NS BURC — RC row reg are (c:), (ds), (da), (e1), (#), {82}, {8s}, (fa), (fa). 
9. NS URC — RC row reg are (¢1), (€2), (di), (ds), (@:), {€2}, {@s}, (fr), (fs). 

10. NS RC — RC row reg are (c:), (di), (de), (ds), (f:), {fe}, {fs}. 


7. A verification of sufficiency. Checks for these results are available in 
the formulas for row limits given in H,;. Thus, for example, formula (46.1), 
p. 51, of H, , and 1 above yield, for r' sufficiently large, 


on — 8 = Dy’ (3 — 8) {1 - i (—1)"" QU” i}, 


where 7’ is the dimension of r’, p is that of k*, and 7 is that of k"; >>’ sums over 
all k* whose elements are included among those of r', and >>” sums over all k" 
whose elements are likewise among those of r’. The part of the expansion 
of the right side corresponding to k* = r' iss, — s. That corresponding to any 
other k* is 


[,  S_ ayn (T - 2 S(T -0 

(ss — 8)<1 — D> (-1) = (43 — s) >| (-1) = 0. 
\ r=1 T ) 7=0 T 

Hence Ori = Spl. 
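The cancellation in the last display appears to rest on the vanishing of the alternating binomial sum, $\sum_{\tau=0}^{T} (-1)^{\tau}\binom{T}{\tau} = 0$ for $T \ge 1$; that reading of the garbled display is an assumption. A minimal check of the identity itself, with a hypothetical function name:

```python
from math import comb


def alternating_binomial_sum(T):
    """Sum over tau = 0..T of (-1)**tau * C(T, tau)."""
    return sum((-1) ** tau * comb(T, tau) for tau in range(T + 1))


if __name__ == "__main__":
    print([alternating_binomial_sum(T) for T in range(0, 8)])
```

The sum equals $(1 - 1)^T$, so it is 1 for $T = 0$ and 0 otherwise; equivalently $1 - \sum_{\tau=1}^{T} (-1)^{\tau-1}\binom{T}{\tau} = 0$ for $T \ge 1$.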


8. Application to double sequences. As application, 10 yields the following
set of conditions on the four-dimensional matrix $\|a_{pqij}\|$ necessary and sufficient
that the double sequence $\{\sigma_{pq}\}$ be rc with all row and column limits the same
as those of $\{s_{ij}\}$ whenever $\{s_{ij}\}$ is rc, where $\sigma_{pq} = \sum_{i,j=1}^{\infty} a_{pqij}\,s_{ij}$.

i,j=1 
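NS RC → RC row reg are:

$$(c_1)\qquad \sum_{i,j=1}^{\infty} |a_{pqij}| < A \qquad (p, q = 1, 2, \cdots);$$

$$(d_1)\qquad \lim_{p,q\to\infty} a_{pqij} = a_{ij} \qquad (i, j = 1, 2, \cdots);$$

$$(d_2)\qquad \lim_{p,q\to\infty} \sum_{i=1}^{\infty} a_{pqij} = L_j \quad (j = 1, 2, \cdots),\qquad
\lim_{p,q\to\infty} \sum_{j=1}^{\infty} a_{pqij} = L_i \quad (i = 1, 2, \cdots);$$

$$(d_3)\qquad \lim_{p,q\to\infty} \sum_{i,j=1}^{\infty} a_{pqij} = L;$$

$$(f_1)\qquad \lim_{p\to\infty} a_{pqij} = 0 \ \text{for all } q,\qquad \lim_{q\to\infty} a_{pqij} = 0 \ \text{for all } p \qquad (i, j = 1, 2, \cdots);$$

$$\{f_2\}\qquad \begin{aligned}
&\lim_{p\to\infty} \sum_{i=1}^{\infty} a_{pqij} = 0 \ \text{for all } q \quad (j = 1, \cdots, q-1, q+1, q+2, \cdots),\\
&\lim_{p\to\infty} \sum_{i=1}^{\infty} a_{pqiq} = 1 \ \text{for all } q,\\
&\lim_{p\to\infty} \sum_{j=1}^{\infty} a_{pqij} = 0 \ \text{for all } q \quad (i = 1, 2, \cdots),\\
&\lim_{q\to\infty} \sum_{i=1}^{\infty} a_{pqij} = 0 \ \text{for all } p \quad (j = 1, 2, \cdots),\\
&\lim_{q\to\infty} \sum_{j=1}^{\infty} a_{pqij} = 0 \ \text{for all } p \quad (i = 1, \cdots, p-1, p+1, p+2, \cdots),\\
&\lim_{q\to\infty} \sum_{j=1}^{\infty} a_{pqpj} = 1 \ \text{for all } p;
\end{aligned}$$

$$\{f_3\}\qquad \lim_{p\to\infty} \sum_{i,j=1}^{\infty} a_{pqij} = 1 \ \text{for all } q,\qquad
\lim_{q\to\infty} \sum_{i,j=1}^{\infty} a_{pqij} = 1 \ \text{for all } p.$$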


9. Implications of the new conditions. Finally, 
{as}, {fo}, and {f5}, to wit: {@} — (e2), {&} — @), 


{f2} — (fe), {fs} — (fs), {fo} — {&2}, {fs} — {8}, (de) + (& " Ione (de), (ds) + {al 


cations of conditions {é@2}, 





(i,j = 1,2, --+), 


Il 
— 
< 


G 


»>—-1,pt+1,p +2, °°’), 


we may note certain impl- 














→ $(d_3)$ with L = 1. (Compare .75, p. 41, of H₁.) With these relations and
those on pp. 37-41 of H₁, the sets of conditions necessary and sufficient for the
various transformations can be materially strengthened. For example, NS RC
→ RC row reg may be written: $(c_1)$, $(d_1)$, $(d_2)$, $(d_3)$ with L = 1, $(f_1)$, $\{f_2\}$, $\{f_3\}$.
Thus in the above application to double sequences $a_{ij}$, $L_i$, and $L_j$ may be
replaced by 0 ($i, j = 1, 2, \cdots$), and L by 1.


PoMONA COLLEGE. 








CONVERGENCE THEOREMS FOR CONTINUED FRACTIONS 
By WALTER LEIGHTON 


1. Introduction. The purpose of this paper is to present a new set of convergence
theorems for continued fractions of the form

$$1 + \cfrac{a_1}{1 + \cfrac{a_2}{1 + \cfrac{a_3}{1 + \cdots}}} \tag{1.1}$$

where the $a_n$ are complex numbers ≠ 0. The method used is an extension of a
method used in an earlier paper (Leighton [1]¹) the results of which now follow
from Theorem 4.1 of the present paper.

A number of writers² proved independently that if $|a_n| \le \tfrac14$ ($n = 2, 3, 4, \cdots$),
the continued fraction (1.1) converges. Szász [1] showed that the constant $\tfrac14$
cannot be improved by proving that the continued fraction

$$\cfrac{-\tfrac14 - \epsilon}{1 + \cfrac{-\tfrac14 - \epsilon}{1 + \cfrac{-\tfrac14 - \epsilon}{1 + \cdots}}}$$

diverges for each value of $\epsilon > 0$. Later, new types of sufficient conditions for
convergence were found (Leighton and Wall [1], Jordan and Leighton [1],
Leighton [2]), but all of these theorems required that at least an infinite subsequence
of the $|a_n| \le \tfrac14$. This last condition was recently removed (Leighton
[1]) by showing that (1.1) converges if


, ! j | 2 
(1.1) a+elzi¢ielL jelz7—., 
1—m 
\[ |a_{2n+1}| \le m < 1, \qquad |a_{2n+2}| \ge 2 + m + m\,|a_{2n}| \qquad (n = 1, 2, 3, \ldots). \]


It will follow incidentally from Theorem 4.4 of the present paper that this 
condition can be removed in still different ways. 
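
The behavior at the constant $\tfrac14$ can be observed numerically. The following is a minimal sketch, not part of the original argument: it assumes every $a_n$ equals one value $a$, so that the approximants of $1 + a/(1 + a/(1 + \cdots))$ satisfy $f_0 = 1$, $f_n = 1 + a/f_{n-1}$. The function name `periodic_approximants` and the choice $\varepsilon = \tfrac14$ are illustrative assumptions.

```python
from fractions import Fraction

def periodic_approximants(a, count):
    """Approximants f_0, f_1, ... of 1 + a/(1 + a/(1 + ...)) with every a_n = a.

    Each approximant is kept as a pair (P, Q) standing for P/Q, so that an
    infinite approximant (Q == 0) is reported as None instead of raising.
    Because f_n = 1 + a / f_{n-1}, the pair updates as (P, Q) -> (P + a*Q, P).
    """
    a = Fraction(a)
    P, Q = Fraction(1), Fraction(1)          # f_0 = 1
    values = []
    for _ in range(count):
        values.append(P / Q if Q != 0 else None)
        P, Q = P + a * Q, P
    return values

# Boundary case a = -1/4, and a = -1/4 - eps with the hypothetical eps = 1/4.
boundary = periodic_approximants(Fraction(-1, 4), 6)
beyond = periodic_approximants(Fraction(-1, 2), 8)
```

With $a = -\tfrac14$ the approximants are $(n+2)/(2n+2)$ and tend slowly to $\tfrac12$; with $a = -\tfrac12$ they repeat the cycle $1, \tfrac12, 0, \infty$ and so have no limit.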
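We recall that the n-th approximant $A_n/B_n$ of a continued fraction

\[ \beta_0 + \cfrac{\alpha_1}{\beta_1 + \cfrac{\alpha_2}{\beta_2 + \cdots}} \tag{1.2} \]

is defined by means of the recursion relations

\[ \begin{aligned} & A_0 = \beta_0, \quad B_0 = 1, \quad A_1 = \beta_0\beta_1 + \alpha_1, \quad B_1 = \beta_1, \\ & A_n = \beta_n A_{n-1} + \alpha_n A_{n-2}, \quad B_n = \beta_n B_{n-1} + \alpha_n B_{n-2} \qquad (n = 2, 3, 4, \ldots). \end{aligned} \tag{1.3} \]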
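
The recursion (1.3) translates directly into a short routine. The sketch below is illustrative only: the function name `approximants`, the use of exact rational arithmetic, and the starting values $A_{-1} = 1$, $B_{-1} = 0$ (a common convention that reproduces the stated $A_1$ and $B_1$) are assumptions of this example.

```python
from fractions import Fraction

def approximants(beta0, alphas, betas):
    """Return [(A_0, B_0), (A_1, B_1), ...] computed by the recursion (1.3).

    alphas[j] and betas[j] hold alpha_{j+1} and beta_{j+1}.
    """
    # Assumed starting values A_{-1} = 1, B_{-1} = 0; they give
    # A_1 = beta_0*beta_1 + alpha_1 and B_1 = beta_1 as in (1.3).
    A_prev, B_prev = Fraction(1), Fraction(0)
    A, B = Fraction(beta0), Fraction(1)
    pairs = [(A, B)]
    for alpha, beta in zip(alphas, betas):
        A, A_prev = beta * A + alpha * A_prev, A
        B, B_prev = beta * B + alpha * B_prev, B
        pairs.append((A, B))
    return pairs

# Hypothetical sample: all alpha_n = beta_n = 1, whose approximants are
# ratios of consecutive Fibonacci numbers.
sample = approximants(1, [1] * 5, [1] * 5)
```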


Received November 15, 1938. 

1 Numbers in brackets refer to the bibliography. 

2 For bibliography on this criterion see Szász [1] and Leighton [1].


It will be useful to designate by $A_{n,\lambda}/B_{n,\lambda}$ (Perron [1], p. 14) the n-th approximant of the truncated continued fraction

\[ \beta_\lambda + \cfrac{\alpha_{\lambda+1}}{\beta_{\lambda+1} + \cfrac{\alpha_{\lambda+2}}{\beta_{\lambda+2} + \cdots}}. \tag{1.4} \]

The plan of the present paper is to form the continued fractions

\[ d_0 + \cfrac{c_1}{d_1 + \cfrac{c_2}{d_2 + \cdots}}, \tag{1.5} \]

the approximants $C_n/D_n$ of which are related to the approximants $A_n/B_n$ of (1.1) as follows:

\[ C_n = A_{kn+r}, \quad D_n = B_{kn+r} \quad (n = 1, 2, 3, \ldots), \qquad C_0 = \frac{A_r}{B_r}, \quad D_0 = 1, \tag{1.6} \]

where k is a fixed positive integer $\ge 2$ and r is chosen from the numbers $0, 1, 2, \ldots, k - 1$. We are thus led to a set of k continued fractions (1.5) with the property that the k sequences of approximants $C_n/D_n$ generated by them together comprise the totality of approximants $A_n/B_n$ of the given continued fraction (1.1). A condition of Pringsheim (Perron [1], p. 254) for convergence will be applied to (1.5). Thus, each of the k continued fractions will converge. An additional condition will be added to insure that these continued fractions converge to the same limit. It will follow that (1.1) converges under these conditions. The result will be the sets of convergence criteria referred to in the first paragraph.


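2. A lemma. It is well known (Perron [1], p. 198) that if $t_0, t_1, t_2, \ldots$ is any sequence of complex numbers, where $t_{n-1} \neq t_n$ $(n = 1, 2, 3, \ldots)$, the continued fraction (1.2) with

\[ \begin{aligned} & \beta_0 = t_0, \qquad \alpha_1 = t_1 - t_0, \qquad \beta_1 = 1, \\ & \alpha_n = \frac{t_{n-1} - t_n}{t_{n-1} - t_{n-2}}, \qquad \beta_n = \frac{t_n - t_{n-2}}{t_{n-1} - t_{n-2}} \qquad (n = 2, 3, 4, \ldots), \end{aligned} \tag{2.1} \]

has approximants $A_n/B_n$ with the property that

\[ A_n = t_n, \qquad B_n = 1 \qquad (n = 0, 1, 2, \ldots). \]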
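
The construction can be checked on a concrete sequence. The following sketch assumes the elements are $\beta_0 = t_0$, $\alpha_1 = t_1 - t_0$, $\beta_1 = 1$, $\alpha_n = (t_{n-1} - t_n)/(t_{n-1} - t_{n-2})$, $\beta_n = (t_n - t_{n-2})/(t_{n-1} - t_{n-2})$; the function names and the sample values are hypothetical.

```python
from fractions import Fraction

def elements_from_sequence(t):
    """Return beta_0 and the lists alpha_1.., beta_1.. built from t_0, t_1, ..."""
    t = [Fraction(x) for x in t]
    alphas, betas = [t[1] - t[0]], [Fraction(1)]
    for n in range(2, len(t)):
        step = t[n - 1] - t[n - 2]          # nonzero since t_{n-1} != t_{n-2}
        alphas.append((t[n - 1] - t[n]) / step)
        betas.append((t[n] - t[n - 2]) / step)
    return t[0], alphas, betas

def approximant_pairs(beta0, alphas, betas):
    """Pairs (A_n, B_n) from A_n = beta_n A_{n-1} + alpha_n A_{n-2}, same for B_n."""
    A_prev, B_prev = Fraction(1), Fraction(0)   # assumed values of A_{-1}, B_{-1}
    A, B = Fraction(beta0), Fraction(1)
    pairs = [(A, B)]
    for alpha, beta in zip(alphas, betas):
        A, A_prev = beta * A + alpha * A_prev, A
        B, B_prev = beta * B + alpha * B_prev, B
        pairs.append((A, B))
    return pairs

t_sample = [1, 3, 2, 7, -4]      # hypothetical values, consecutive terms distinct
pairs = approximant_pairs(*elements_from_sequence(t_sample))
```

Every pair comes out as $(t_n, 1)$, in agreement with the stated property.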


If the numbers $t_n$ are defined as quotients $r_n/s_n$, it will be useful to determine numbers $c_n$ and $d_n$ such that the n-th approximant $C_n/D_n$ of a continued fraction of the form (1.5) shall satisfy the following relations:

\[ C_n = r_n, \quad D_n = s_n \quad (n = 1, 2, 3, \ldots), \qquad C_0 = \frac{r_0}{s_0}, \quad D_0 = 1. \tag{2.2} \]















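To this end we shall recall (Perron [1], p. 196) that the n-th approximant $U_n/V_n$ of the continued fraction

\[ \beta_0 + \cfrac{u_1\alpha_1}{u_1\beta_1 + \cfrac{u_1u_2\alpha_2}{u_2\beta_2 + \cfrac{u_2u_3\alpha_3}{u_3\beta_3 + \cdots + \cfrac{u_{n-1}u_n\alpha_n}{u_n\beta_n + \cdots}}}} \]

has the properties

\[ \begin{aligned} & U_0 = \frac{r_0}{s_0}, \qquad U_1 = u_1\frac{r_1}{s_1}, \qquad V_1 = u_1, \\ & U_n = u_1u_2\cdots u_n\,\frac{r_n}{s_n}, \qquad V_n = u_1u_2\cdots u_n \qquad (n = 2, 3, 4, \ldots). \end{aligned} \tag{2.3} \]

The proof of the following lemma is now immediate.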


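LEMMA. If the numbers $c_n$ and $d_n$ are defined by the equations

\[ \begin{aligned} d_0 &= \frac{r_0}{s_0}, \qquad c_1 = s_1\left[\frac{r_1}{s_1} - \frac{r_0}{s_0}\right], \qquad d_1 = s_1, \\ c_n &= \frac{s_n}{s_{n-2}}\left[\frac{\dfrac{r_{n-1}}{s_{n-1}} - \dfrac{r_n}{s_n}}{\dfrac{r_{n-1}}{s_{n-1}} - \dfrac{r_{n-2}}{s_{n-2}}}\right], \\ d_n &= \frac{s_n}{s_{n-1}}\left[\frac{\dfrac{r_n}{s_n} - \dfrac{r_{n-2}}{s_{n-2}}}{\dfrac{r_{n-1}}{s_{n-1}} - \dfrac{r_{n-2}}{s_{n-2}}}\right] \qquad (n = 2, 3, 4, \ldots), \end{aligned} \tag{2.4} \]

the n-th approximant $C_n/D_n$ of the continued fraction

\[ d_0 + \cfrac{c_1}{d_1 + \cfrac{c_2}{d_2 + \cdots}} \tag{2.5} \]

has the properties

\[ C_n = r_n, \quad D_n = s_n \quad (n = 1, 2, 3, \ldots), \qquad C_0 = \frac{r_0}{s_0}, \quad D_0 = 1. \]

It is sufficient to observe that in (2.3) one may set $u_1 = s_1$, $u_n = s_n/s_{n-1}$ $(n = 2, 3, 4, \ldots)$.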


3. The principal theorem. In the following result m is any fixed positive constant $< 1$ and $r_0$ is one of the integers $0, 1, 2, \ldots, k - 1$. The integer $r_0$, once chosen, is to be kept fixed, but its choice is quite arbitrary. The integer k is fixed $\ge 2$.














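THEOREM 3.1. If the denominators $B_r$ $(r = 0, 1, 2, \ldots, k - 1)$ of the first k approximants of the continued fraction

\[ 1 + \cfrac{a_1}{1 + \cfrac{a_2}{1 + \cdots}} \tag{3.1} \]

are $\neq 0$, if the numbers $B_{k-1,\lambda}$ $(\lambda = 1, 2, 3, \ldots)$ are $\neq 0$, if

\[ |a_{nk+s+1}\,B_{k-2,\,nk+s+1}| \le m < 1 \tag{3.2} \]

for each value of s except those $\equiv r_0 \pmod k$, and if

\[ |B_r B_{k+r}| \ge |B_r| + |a_1a_2\cdots a_{r+1}\,B_{k-1,\,r+1}| \qquad (r = 0, 1, 2, \ldots, k - 1), \tag{3.3} \]

\[ |B_{2k-1,\,\lambda}| \ge |B_{k-1,\,\lambda}| + |a_{\lambda+1}a_{\lambda+2}\cdots a_{\lambda+k}\,B_{k-1,\,\lambda+k}| \qquad (\lambda = 1, 2, 3, \ldots), \tag{3.4} \]

the continued fraction (3.1) converges.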


Thus for each integral value of k greater than or equal to 2, this theorem yields a convergence criterion corresponding to each of the k distinct choices of $r_0$.

To prove the theorem construct the continued fraction (2.5) the n-th approximant $C_n/D_n$ of which satisfies the conditions

\[ C_n = A_{nk+r}, \quad C_0 = \frac{A_r}{B_r}, \qquad D_n = B_{nk+r}, \quad D_0 = 1. \tag{3.5} \]
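We find by means of formulas (2.4) that

\[ \begin{aligned} d_0 &= \frac{A_r}{B_r}, \qquad c_1 = B_{k+r}\left[\frac{A_{k+r}}{B_{k+r}} - \frac{A_r}{B_r}\right], \qquad d_1 = B_{k+r}, \\ c_n &= \frac{B_{nk+r}}{B_{(n-2)k+r}}\left[\frac{\dfrac{A_{(n-1)k+r}}{B_{(n-1)k+r}} - \dfrac{A_{nk+r}}{B_{nk+r}}}{\dfrac{A_{(n-1)k+r}}{B_{(n-1)k+r}} - \dfrac{A_{(n-2)k+r}}{B_{(n-2)k+r}}}\right], \\ d_n &= \frac{B_{nk+r}}{B_{(n-1)k+r}}\left[\frac{\dfrac{A_{nk+r}}{B_{nk+r}} - \dfrac{A_{(n-2)k+r}}{B_{(n-2)k+r}}}{\dfrac{A_{(n-1)k+r}}{B_{(n-1)k+r}} - \dfrac{A_{(n-2)k+r}}{B_{(n-2)k+r}}}\right] \qquad (n = 2, 3, 4, \ldots). \end{aligned} \]

One can simplify these equations by means of the well known formula (Perron [1], p. 17)

\[ \frac{A_{\lambda+\nu}}{B_{\lambda+\nu}} - \frac{A_\lambda}{B_\lambda} = \frac{(-1)^{\lambda}\,a_1a_2\cdots a_{\lambda+1}\,B_{\nu-1,\,\lambda+1}}{B_{\lambda+\nu}\,B_\lambda}, \tag{3.6} \]

obtaining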








\[ \begin{aligned} d_0 &= \frac{A_r}{B_r}, \qquad c_1 = (-1)^r a_1a_2\cdots a_{r+1}\,\frac{B_{k-1,\,r+1}}{B_r}, \qquad d_1 = B_{k+r}, \\ c_n &= (-1)^{k+1}\,a_{(n-2)k+r+2}\,a_{(n-2)k+r+3}\cdots a_{(n-1)k+r+1}\,\frac{B_{k-1,\,(n-1)k+r+1}}{B_{k-1,\,(n-2)k+r+1}}, \\ d_n &= \frac{B_{2k-1,\,(n-2)k+r+1}}{B_{k-1,\,(n-2)k+r+1}} \qquad (n = 2, 3, 4, \ldots). \end{aligned} \tag{3.7} \]

It is clear that the numbers $c_n$ and $d_n$ are well-defined because of the precautionary hypotheses $B_r \neq 0$, $B_{k-1,\lambda} \neq 0$ of the theorem.
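
The difference formula used in this simplification can be tested with exact arithmetic. The sketch below assumes the identity in the form $A_{\lambda+\nu}/B_{\lambda+\nu} - A_\lambda/B_\lambda = (-1)^{\lambda} a_1 a_2 \cdots a_{\lambda+1} B_{\nu-1,\lambda+1} / (B_{\lambda+\nu} B_\lambda)$ for the fraction $1 + a_1/1 + a_2/1 + \cdots$, where $B_{j,\lambda}$ is the j-th denominator of the fraction truncated at $\lambda$. The function names and the sample elements are hypothetical.

```python
from fractions import Fraction
from math import prod

def numerators_denominators(a, lam, count):
    """A_{j,lam}, B_{j,lam} (j = 0..count) for 1 + a_{lam+1}/1 + a_{lam+2}/1 + ...

    `a` is indexed from 1 (a[0] is unused); lam = 0 gives the fraction itself.
    """
    A_prev, B_prev = Fraction(1), Fraction(0)
    A, B = Fraction(1), Fraction(1)
    As, Bs = [A], [B]
    for j in range(1, count + 1):
        A, A_prev = A + a[lam + j] * A_prev, A
        B, B_prev = B + a[lam + j] * B_prev, B
        As.append(A)
        Bs.append(B)
    return As, Bs

def identity_holds(a, lam, nu):
    """Compare both sides of the difference formula for given lam and nu >= 1."""
    As, Bs = numerators_denominators(a, 0, lam + nu)
    _, tail_Bs = numerators_denominators(a, lam + 1, nu - 1)
    left = As[lam + nu] / Bs[lam + nu] - As[lam] / Bs[lam]
    right = ((-1) ** lam * prod(a[1:lam + 2]) * tail_Bs[nu - 1]
             / (Bs[lam + nu] * Bs[lam]))
    return left == right

# Hypothetical positive elements, so that no denominator vanishes.
a_sample = [None] + [Fraction(n + 1, 2) for n in range(1, 13)]
all_hold = all(identity_holds(a_sample, lam, nu)
               for lam in range(5) for nu in range(1, 6))
```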
Pringsheim has shown (Perron [1], p. 254) that if the elements $c_n$ and $d_n$ of a continued fraction

\[ d_0 + \cfrac{c_1}{d_1 + \cfrac{c_2}{d_2 + \cdots}} \tag{3.8} \]

satisfy the conditions

\[ |d_n| \ge |c_n| + 1 \qquad (n = 1, 2, 3, \ldots), \tag{3.9} \]

the continued fraction converges and the denominators of consecutive approximants have the property that

\[ |D_n| - |D_{n-1}| \ge |c_1c_2\cdots c_n| \qquad (n = 1, 2, 3, \ldots). \tag{3.10} \]

Conditions (3.9) subject to (3.7) are precisely conditions (3.3) together with

\[ |B_{2k-1,\,(n-2)k+r+1}| \ge |B_{k-1,\,(n-2)k+r+1}| + |a_{(n-2)k+r+2}\,a_{(n-2)k+r+3}\cdots a_{(n-1)k+r+1}\,B_{k-1,\,(n-1)k+r+1}|, \tag{3.11} \]

where $r = 0, 1, 2, \ldots, k - 1$ and $n = 2, 3, 4, \ldots$. Conditions (3.11), however, are easily seen to be equivalent to conditions (3.4). Thus under the hypotheses of the theorem the continued fraction (3.8) subject to (3.7) converges for each of the k values of r. It follows that the sequences $\{A_{kn+r}/B_{kn+r}\}_{n=0}^{\infty}$, where $A_{kn+r}/B_{kn+r}$ is the (kn + r)-th approximant of (3.1), converge for each $r = 0, 1, 2, \ldots, k - 1$. It remains to prove that these k sequences have the same limit.
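
Pringsheim's condition and the resulting growth of the denominators are easy to examine on a sample. The following sketch assumes the condition in the form $|d_n| \ge |c_n| + 1$ and the growth property in the form $|D_n| - |D_{n-1}| \ge |c_1 c_2 \cdots c_n|$; the function names and the choice $c_n = n$, $d_n = n + 1$ are hypothetical.

```python
from fractions import Fraction
from math import prod

def pringsheim_holds(c, d):
    """Check |d_n| >= |c_n| + 1 for every listed pair of elements."""
    return all(abs(dn) >= abs(cn) + 1 for cn, dn in zip(c, d))

def denominators(c, d):
    """D_0, D_1, ... for d_0 + c_1/d_1 + c_2/d_2 + ... (d_0 plays no part)."""
    D_prev, D = Fraction(0), Fraction(1)
    out = [D]
    for cn, dn in zip(c, d):
        D, D_prev = dn * D + cn * D_prev, D
        out.append(D)
    return out

def growth_holds(c, d):
    """Check |D_n| - |D_{n-1}| >= |c_1 c_2 ... c_n| for every available n."""
    D = denominators(c, d)
    return all(abs(D[n]) - abs(D[n - 1]) >= abs(prod(c[:n]))
               for n in range(1, len(D)))

c_sample = [Fraction(n) for n in range(1, 9)]        # hypothetical c_n = n
d_sample = [Fraction(n + 1) for n in range(1, 9)]    # hypothetical d_n = n + 1
```

In this sample the first difference $|D_1| - |D_0|$ equals $|c_1|$ exactly, so the growth inequality cannot be sharpened in general.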

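To that end we observe that it is sufficient to prove that $\lim_{n\to\infty} \Delta_n^r = 0$, where

\[ \Delta_n^r = \left|\frac{A_{kn+r+1}}{B_{kn+r+1}} - \frac{A_{kn+r}}{B_{kn+r}}\right| = \frac{|a_1a_2\cdots a_{kn+r+1}|}{|B_{kn+r}\,B_{kn+r+1}|} \]

and $r = r_0, r_0 + 1, \ldots, r_0 + k - 2$. The latter equality is a consequence of (3.6). By (3.10) and (3.7) we have

\[ |B_{(n+1)k+r}| - |B_{nk+r}| \ge |a_1a_2\cdots a_{nk+r+1}| \cdot \left|\frac{B_{k-1,\,nk+r+1}}{B_r}\right|, \tag{3.12} \]

and hence

\[ \Delta_n^r \le \left|\frac{B_r}{B_{nk+r+1}\,B_{k-1,\,nk+r+1}}\right| \left[\frac{|B_{(n+1)k+r}|}{|B_{nk+r}|} - 1\right] \qquad (n = 0, 1, 2, \ldots). \tag{3.13} \]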
















By a well-known result (Perron [1], p. 18)

\[ B_{nk+r+1}\,B_{k-1,\,nk+r+1} = B_{(n+1)k+r} - a_{nk+r+2}\,B_{nk+r}\,B_{k-2,\,nk+r+2}. \tag{3.14} \]

Thus

\[ \Delta_n^r \le \left|\frac{B_r}{B_{nk+r}}\right| \left[\frac{1 - \dfrac{|B_{nk+r}|}{|B_{(n+1)k+r}|}}{\left|\,1 - a_{nk+r+2}\,B_{k-2,\,nk+r+2}\,\dfrac{B_{nk+r}}{B_{(n+1)k+r}}\right|}\right]. \tag{3.15} \]

Now (3.12) insures us that $|B_{nk+r}|$ is strictly increasing with n, and hence that $\lim_{n\to\infty} |B_{nk+r}| = B^{(r)}$, where $B^{(r)}$ may be infinite. Applying conditions (3.2) to (3.15), we observe that the denominator in the bracket is $> 1 - m$. It follows that $\lim_{n\to\infty} \Delta_n^r = 0$ $(r = r_0, r_0 + 1, \ldots, r_0 + k - 2)$ whether $B^{(r)}$ is finite or infinite. The proof is complete.

If now in (3.11) we set $r = r_0, r_0 + 1, \ldots, r_0 + k - 2$ and apply conditions (3.2), we obtain the following result.

THEOREM 3.2. If the denominators $B_r$ $(r = 0, 1, 2, \ldots, k - 1)$ of the first k approximants of the continued fraction (3.1) are $\neq 0$, if the numbers $B_{k-1,\lambda}$ $(\lambda = 1, 2, 3, \ldots)$ are $\neq 0$, if

\[ |a_{nk+s+1}\,B_{k-2,\,nk+s+1}| \le m < 1 \tag{3.16} \]

for each value of $s \not\equiv r_0 \pmod k$, and if

\[ |B_r B_{k+r}| \ge |B_r| + |a_1a_2\cdots a_{r+1}\,B_{k-1,\,r+1}| \qquad (r = 0, 1, 2, \ldots, k - 1), \]

\[ |B_{2k-1,\,\lambda}| \ge \frac{m}{|a_\lambda|}\Big[1 + |a_\lambda a_{\lambda+1}\cdots a_{\lambda+k-1}|\Big] \qquad (\lambda = \lambda_0 + 1, \lambda_0 + 2, \ldots, \lambda_0 + k - 1), \tag{3.17} \]

\[ |B_{2k-1,\,\lambda_0}| \ge |B_{k-1,\,\lambda_0}| + |a_{\lambda_0+1}a_{\lambda_0+2}\cdots a_{\lambda_0+k}\,B_{k-1,\,\lambda_0+k}|, \tag{3.18} \]

where $n = 2, 3, \ldots$ and $\lambda_0 = (n - 2)k + r_0 + 1$, the continued fraction (3.1) converges.
The proof is immediate. Corollaries of this theorem will be examined in the 
next section. We observe here that the theorem will remain valid if conditions 
3.18) are replaced by the conditions 


' ] 
3.19) Bu Ade = By. 1,Ao + a Qo + Boy. 1 rg +1 By —1,A9+k 
since (3.19) implies (3.18). 
4. Consequences of the preceding theorems. In this section we shall derive a number of results from Theorems 3.1 and 3.2 by assigning special values to k.

Suppose k = 2. Then $B_r = 1$ $(r = 0, 1)$ and $B_{k-1,\lambda} = B_{1,\lambda} = 1$ $(\lambda = 1, 2, 3, \ldots)$. We have the following corollary of Theorem 3.1, combining the case $r_0 = 0$ with that of $r_0 = 1$.

THEOREM 4.1. If either $|a_{2n+1}| \le m < 1$ $(n = 1, 2, 3, \ldots)$ or $|a_{2n}| \le m < 1$ $(n = 1, 2, 3, \ldots)$, and if

\[ |1 + a_2| \ge 1 + |a_1|, \qquad |1 + a_n + a_{n+1}| \ge 1 + |a_{n-1}a_n| \qquad (n = 2, 3, \ldots), \]

the continued fraction

\[ 1 + \cfrac{a_1}{1 + \cfrac{a_2}{1 + \cdots}} \tag{4.1} \]

converges.
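
The hypotheses of Theorem 4.1 lend themselves to a mechanical check on a finite run of elements. The sketch below is illustrative and rests on assumptions: the hypotheses are read as $|a_{2n+1}| \le m < 1$ or $|a_{2n}| \le m < 1$, together with $|1 + a_2| \ge 1 + |a_1|$ and $|1 + a_n + a_{n+1}| \ge 1 + |a_{n-1}a_n|$; a finite prefix can only fail to contradict them, it cannot prove convergence. The function names and the sample $a_{2n-1} = -\tfrac12$, $a_{2n} = -8$ are hypothetical.

```python
import math

def theorem_4_1_holds(a, m):
    """Check the hypotheses on the finite prefix a[1..N] (a[0] is unused)."""
    N = len(a) - 1
    odd_small = all(abs(a[j]) <= m for j in range(3, N + 1, 2))
    even_small = all(abs(a[j]) <= m for j in range(2, N + 1, 2))
    first = abs(1 + a[2]) >= 1 + abs(a[1])
    rest = all(abs(1 + a[n] + a[n + 1]) >= 1 + abs(a[n - 1] * a[n])
               for n in range(2, N))
    return m < 1 and (odd_small or even_small) and first and rest

def approximant(a, n):
    """n-th approximant of 1 + a_1/1 + a_2/1 + ..., evaluated from the tail."""
    w = 1.0
    for j in range(n, 0, -1):
        w = 1.0 + a[j] / w
    return w

# Hypothetical sample with period 2: odd elements -1/2, even elements -8.
a_sample = [None] + [-0.5 if j % 2 else -8.0 for j in range(1, 61)]
# For this sample the value x satisfies x = 1 + a_1/(1 + a_2/x),
# that is, x^2 - 8.5 x + 8 = 0; the approximants approach the smaller root.
limit = (8.5 - math.sqrt(40.25)) / 2
```

Every element of this sample exceeds $\tfrac14$ in absolute value, yet the approximants settle near 1.0779.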

These are particularly simple conditions. For a further discussion of the case k = 2 the reader is referred to an earlier paper (Leighton [1]). We observe that since (1.1)′ imply the conditions of Theorem 4.1, this theorem and hence Theorem 3.1 are independent of earlier criteria. It will follow from Theorem 4.4, for example, that Theorem 3.1 is indeed more general than Theorem 4.1 as well. It may be pointed out that Pringsheim's conditions (3.9) never apply to continued fractions of the form (4.1).

We now consider the case k = 3, taking $r_0 = 0$ for simplicity and observing that each theorem stated henceforth represents k theorems which can be derived from the given theorem by advancing suitably the subscripts involved. Here r runs over the values 0, 1, and 2, and $B_0 = B_1 = 1$, $B_2 = 1 + a_2$, $B_{k-1,\lambda} = B_{2,\lambda} = 1 + a_{\lambda+2}$. Referring to Theorem 3.2 we must require that

\[ \begin{aligned} & 1 + a_n \neq 0 \qquad (n = 2, 3, \ldots), \\ & |a_{3n+2}| \le m < 1, \qquad |a_{3n+3}| \le m < 1 \qquad (n = 0, 1, 2, \ldots), \\ & |B_3| \ge 1 + |a_1(1 + a_3)|, \qquad |B_4| \ge 1 + |a_1a_2(1 + a_4)|, \\ & |(1 + a_2)B_5| \ge |1 + a_2| + |a_1a_2a_3(1 + a_5)|. \end{aligned} \tag{4.2} \]

We can now state the following corollary of Theorem 3.2.


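THEOREM 4.2. If conditions (4.2) are satisfied, and if

\[ |B_{5,\,3n-1}| \ge m\left[\frac{1}{|a_{3n-1}|} + m\,|a_{3n+1}|\right] \qquad (n = 1, 2, 3, \ldots), \tag{4.3} \]

\[ |B_{5,\,3n}| \ge m\left[\frac{1}{|a_{3n}|} + m\,|a_{3n+1}|\right] \qquad (n = 1, 2, 3, \ldots), \tag{4.4} \]

\[ |B_{5,\,3n+1}| \ge |B_{2,\,3n+1}| + m^2\,|a_{3n+4}\,B_{2,\,3n+4}| \qquad (n = 0, 1, 2, \ldots), \tag{4.5} \]

the continued fraction (4.1) converges.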







A simple consequence of this theorem will now be developed. In view of (4.2) we shall suppose henceforth that $a_n \neq -1$ $(n = 2, 3, \ldots)$ and that

\[ 0 < \delta \le |a_{3n+\lambda}| \le m \le \tfrac12 \qquad (\lambda = 2, 3;\ n = 0, 1, 2, \ldots), \tag{4.21} \]

where $\delta$ and m are independent of n and $\lambda$.
Write relations (4.3), (4.4) and (4.5) in expanded form:

\[ |a_{3n+2}(1 + a_{3n+4}) + (1 + a_{3n+1})(1 + a_{3n+3} + a_{3n+4})| \ge m\left[\frac{1}{|a_{3n-1}|} + m\,|a_{3n+1}|\right], \tag{4.3$'$} \]

\[ |a_{3n+3}(1 + a_{3n+5}) + (1 + a_{3n+2})(1 + a_{3n+4} + a_{3n+5})| \ge m\left[\frac{1}{|a_{3n}|} + m\,|a_{3n+1}|\right], \tag{4.4$'$} \]

\[ |a_{3n+4}(1 + a_{3n+6}) + (1 + a_{3n+3})(1 + a_{3n+5} + a_{3n+6})| \ge |1 + a_{3n+3}| + m^2\,|a_{3n+4}(1 + a_{3n+6})|. \tag{4.5$'$} \]


We shall endeavor to replace each of these conditions by simpler and less general conditions which when fulfilled will imply that conditions (4.3)′, (4.4)′ and (4.5)′ respectively are fulfilled. To that end we observe that conditions (4.5)′ will be satisfied if

\[ |a_{3n+4}(1 + a_{3n+6})| - |(1 + a_{3n+5} + a_{3n+6})(1 + a_{3n+3})| \ge |1 + a_{3n+3}| + m^2\,|a_{3n+4}(1 + a_{3n+6})|, \]

which in turn will be true if

\[ |a_{3n+4}|(1 - m) \ge (1 + m)\big[1 + m^2\,|a_{3n+4}| + (1 + 2m)\big]. \tag{4.5$''$} \]

Finally (4.5)″ will be satisfied if the conditions

\[ |a_{3n+4}| \ge \frac{2(1 + m)^2}{1 - m - m^2 - m^3} \qquad (m \le \tfrac12;\ n = 0, 1, 2, \ldots) \tag{4.5$'''$} \]

are satisfied.

We turn our attention to conditions (4.4)′ and observe that (4.4)′ will be satisfied if

\[ |a_{3n+4}|(1 - m) \ge (1 + m)^2 + m(1 + m) + \frac{m}{|a_{3n}|} + m^2\,|a_{3n+1}|, \tag{4.4$''$} \]

or if

\[ |a_{3n+4}| \ge \frac{1 + 3m + 2m^2 + t}{1 - m} + \frac{m^2}{1 - m}\,|a_{3n+1}| \qquad (t \ge 1), \tag{4.4$'''$} \]

where $m = \delta t$ and $n = 1, 2, 3, \ldots$.











Conditions (4.3)′ will now be treated in somewhat similar fashion. One can verify readily that the following conditions imply (4.3)′:

\[ |a_{3n+4}| \cdot |1 + a_{3n+1} + a_{3n+2}| \ge t + m^2\,|a_{3n+1}| + |1 + a_{3n+1} + a_{3n+2} + a_{3n+3} + a_{3n+1}a_{3n+3}|. \tag{4.3$''$} \]

Further, conditions (4.3)″ are implied by conditions

\[ |a_{3n+4}| \ge 1 + \frac{t + m^2\,|a_{3n+1}| + m + m\,|a_{3n+1}|}{|a_{3n+1}| - 1 - m} \qquad (m = \delta t,\ t \ge 1;\ n = 1, 2, 3, \ldots). \tag{4.3$'''$} \]
Conditions (4.5)’” imply | djnsa | > 2(1 + m) (n = 0,1, 2,---). Thus the 
right member of (4.3)’” is less than [m* | dsnyi | + 1 + 2m + t + m']/(1 +m) 
(n = 1,2,3,---). It is easy to see that the right member of (4.4)’” for each 


n is greater than the right member of (4.3)’.. Thus, if conditions (4.4)”” are 
satisfied, conditions (4.3)’” are automatically fulfilled. 

Further, it is easily verified that if | ay | satisfies (4.5)’’, | dsn44 | also satisfies 
(4.5)’" (n = 1, 2, 3,--- ). We conclude then that the condition 


2(1 + m) 


(4.51) |a4| = = ; 
1 — m — m* — m' 


and (4.4)’” together imply (4.3), (4.4), and (4.5). 

We proceed to replace the final three "initial conditions" (4.2) by simpler and somewhat less general conditions. The first of these conditions, $|B_3| \ge 1 + |a_1(1 + a_3)|$, can be written

\[ |a_1| \le \frac{|1 + a_2 + a_3| - 1}{|1 + a_3|}. \tag{4.6} \]

The second of the last three conditions (4.2) can be written

\[ |1 + a_2 + a_3 + a_4 + a_2a_4| \ge 1 + |a_1a_2(1 + a_4)|. \tag{4.7} \]

One verifies readily enough that if

\[ 1 - m - m\,|a_1| > 0, \tag{4.8} \]

(4.7) will be satisfied if

\[ |a_4| \ge \frac{2 + 2m + m\,|a_1|}{1 - m - m\,|a_1|}. \tag{4.7$'$} \]

Finally, the last condition (4.2) will be satisfied if

\[ |a_3(1 + a_5) + (1 + a_2)(1 + a_4 + a_5)| \ge 1 + m^2\,|a_1|\,\frac{1 + m}{1 - m}. \tag{4.9} \]

Condition (4.9) will be satisfied if

\[ |a_4| \ge \frac{1}{1 - m}\left[2 + 3m + m^2 + m^2\,|a_1|\,\frac{1 + m}{1 - m}\right]. \tag{4.9$'$} \]







We thus have proved the following result.

THEOREM 4.4. If the elements $a_n$ of (4.1) satisfy the conditions

\[ \begin{aligned} & a_2 \neq -\tfrac12 \ \text{or}\ a_3 \neq -\tfrac12, \\ & 0 < \delta \le |a_{3n+\lambda}| \le m \le \tfrac12 \qquad (\lambda = 2, 3;\ n = 0, 1, 2, \ldots), \\ & |a_4| \ge M, \end{aligned} \tag{4.10} \]

where M is the largest of the right members of (4.51), (4.7)′, and (4.9)′, and if

\[ |a_{3n+4}| \ge \frac{1 + 3m + 2m^2 + t}{1 - m} + \frac{m^2}{1 - m}\,|a_{3n+1}| \qquad (m = \delta t,\ t \ge 1,\ n = 1, 2, 3, \ldots), \tag{4.11} \]

the continued fraction (4.1) converges.


The proof of the theorem can now be given briefly. Choose $|a_1|$ so small that (4.6) and (4.8) are satisfied. Now let M be the largest of the right members of (4.51), (4.7)′, and (4.9)′. Thus conditions (4.10) with this choice of M imply that conditions (4.2) are satisfied. By the remark following (4.51) conditions (4.11) imply (4.3), (4.4), and (4.5). We observe that the convergence of (4.1) is independent of the choice of $a_1 \neq 0$. The rôle of the conditions applied to $|a_1|$ in the proof of the theorem is thus purely catalytic, and the conditions applied to this quantity, having served their purpose, may now be removed. The proof is complete.

It is clear that similar theorems can be obtained for the case k = 3 if $r_0$ is set equal to 1 and to 2 successively. The results will be analogous to Theorem 4.4 with the subscripts suitably advanced in the conditions of that theorem.

To illustrate the preceding theorem let us set $m = \tfrac12$, $\delta = \tfrac14$, and hence $t = 2$. Conditions (4.11) become

\[ |a_{3n+4}| \ge 10 + \tfrac12\,|a_{3n+1}| \qquad (n = 1, 2, 3, \ldots). \tag{4.12} \]

The right members of (4.51), (4.7)′ and (4.9)′ become respectively $24$, $(6 + |a_1|)/(1 - |a_1|)$, $\tfrac12(15 + 3|a_1|)$. We have the following result.

Example. If the elements $a_n$ of (4.1) satisfy the conditions

\[ \begin{aligned} & |a_4| \ge 24, \qquad a_2 \ \text{or}\ a_3 \neq -\tfrac12, \\ & \tfrac14 \le |a_{3n+\lambda}| \le \tfrac12 \qquad (\lambda = 2, 3;\ n = 0, 1, 2, \ldots), \\ & |a_{3n+4}| \ge 10 + \tfrac12\,|a_{3n+1}| \qquad (n = 1, 2, 3, \ldots), \end{aligned} \]

the continued fraction (4.1) converges.
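
The conditions of the example can be checked on a finite run of elements. The sketch below is illustrative and takes the constants of the example to be $m = \tfrac12$, $\delta = \tfrac14$, that is $|a_4| \ge 24$, $\tfrac14 \le |a_{3n+\lambda}| \le \tfrac12$ for $\lambda = 2, 3$, and $|a_{3n+4}| \ge 10 + \tfrac12 |a_{3n+1}|$; these readings, the function names, and the sample sequence are assumptions of this example.

```python
def example_conditions_hold(a):
    """Check the example's conditions on the prefix a[1..N] (a[0] is unused)."""
    N = len(a) - 1
    big_first = abs(a[4]) >= 24
    not_both = not (a[2] == -0.5 and a[3] == -0.5)
    # Indices 3n+2 and 3n+3 (n >= 0) are exactly those j >= 2 with j % 3 != 1.
    bounded = all(0.25 <= abs(a[j]) <= 0.5
                  for j in range(2, N + 1) if j % 3 != 1)
    growth = all(abs(a[3 * n + 4]) >= 10 + 0.5 * abs(a[3 * n + 1])
                 for n in range(1, (N - 4) // 3 + 1))
    return big_first and not_both and bounded and growth

def approximant(a, n):
    """n-th approximant of 1 + a_1/1 + a_2/1 + ..., evaluated from the tail."""
    w = 1.0
    for j in range(n, 0, -1):
        w = 1.0 + a[j] / w
    return w

# Hypothetical sample: a_1 = 1, a_{3n+1} = 24 for n >= 1, all other elements 1/2.
a_sample = [None, 1.0] + [24.0 if j % 3 == 1 else 0.5 for j in range(2, 61)]
a_bad = list(a_sample)
a_bad[4] = 5.0                 # violates |a_4| >= 24
```

For the sample sequence the approximants of orders 45 and 60 already agree to within rounding error.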

The example provides another set of convergent continued fractions of the form (4.1) all of the elements of which can be greater than $\tfrac14$ in absolute value. It is easily seen that the conditions of Theorems 4.3 and 4.4 are thus independent of earlier known conditions for the convergence of continued fractions (4.1).













Indeed the conditions thus derived from the general theorem by setting k = 3 
are independent of those conditions (for example (1.1)’) derived from the general 
theorem by setting k = 2. 

We conclude with the following result. 


THEOREM 4.5. If the elements $a_n$ of (4.1) are any functions of any number of complex variables, the continued fraction (4.1) converges uniformly throughout any closed region characterized by the inequalities of Theorem 3.1. In particular, if the $a_n$ are analytic functions of a complex variable z, the continued fraction converges to an analytic function of z throughout the interior of any closed region throughout which the inequalities of Theorem 3.1 are valid.


The proof may be given along well-known lines (cf. Perron [1], p. 260) and 
so is omitted. It is clear that in Theorem 4.5 the conditions of Theorem 3.1 
may be replaced by the conditions of any of the theorems of this paper or by 
those of the example without loss of validity. 


BIBLIOGRAPHY 


J. Q. JORDAN AND W. LEIGHTON.
1. On the permutation of the convergents of a continued fraction and related convergence criteria, Annals of Mathematics, vol. 39(1938), pp. 872-882.
W. LEIGHTON.
1. Sufficient conditions for the convergence of a continued fraction, this Journal, vol. 4(1938), pp. 775-778.
2. A test-ratio test for continued fractions, Bulletin of the American Mathematical Society, vol. 45(1939), pp. 97-100.
W. LEIGHTON AND H. S. WALL.
1. On the transformation and convergence of continued fractions, American Journal of Mathematics, vol. 58(1936), pp. 267-281.
O. PERRON.
1. Die Lehre von den Kettenbrüchen, Leipzig, 1929.
O. SZÁSZ.
1. Über die Erhaltung der Konvergenz Kettenbrüche bei independenter Veränderlichkeit aller ihrer Elemente, Journal für Mathematik, vol. 147(1916), pp. 132-160.


THE RICE INSTITUTE.







GENERALIZED PROBLEM OF BOLZA IN THE CALCULUS 
OF VARIATIONS 


By M. R. HESTENES


1. Introduction. The problem to be studied in the present paper is that of minimizing a function

\[ I(C) = g(a) + \int_{x_1}^{x_2} f(a, x, y, y')\,dx \tag{1.1} \]

in a class of admissible arcs C of the form

\[ a_h, \quad y_i(x) \qquad (x_1 \le x \le x_2;\ h = 1, \ldots, r;\ i = 1, \ldots, n) \tag{1.2} \]

in axy-space satisfying the conditions

\[ \varphi_\gamma(a, x, y, y') = 0 \qquad (\gamma = 1, \ldots, m < n), \tag{1.3a} \]

\[ x_s = x_s(a), \qquad y_i(x_s) = y_{is}(a) \qquad (s = 1, 2), \tag{1.3b} \]

\[ I_\rho = g_\rho(a) + \int_{x_1}^{x_2} f_\rho(a, x, y, y')\,dx = 0 \qquad (\rho = 1, \ldots, p). \tag{1.3c} \]

The a's are independent of the variable x. In the following pages it will be convenient to designate this problem as problem A.

The problem just formulated can be modified in many ways. For example, one can suppose that the functions $x_s(a)$ are constants, since this result can be brought about by replacing x by a new variable t by means of the transformation $x = x_1(a) + t[x_2(a) - x_1(a)]$ $(0 \le t \le 1)$. Moreover, one can assume that the functions $g(a)$, $g_\rho(a)$ are identically zero, for along an admissible arc C satisfying the conditions (1.3) the function (1.1) can be put in the form

\[ I(C) = \int_{x_1}^{x_2} \left\{ f + \frac{g}{x_2(a) - x_1(a)} \right\} dx \]

and a similar expression holds for $I_\rho(C)$. The simplicity of these transformations of problem A is due to the presence of the a's in the functions $f$, $f_\rho$, $\varphi_\gamma$. The introduction of the a's in these functions not only enlarges the class of problems that are immediate special cases of our problem, but also gives rise to a more symmetric theory. This is particularly true in the theory of Mayer fields, as will be seen in §3 below. However, problem A can be reduced to one in which the functions $f$, $f_\rho$, $\varphi_\gamma$ are independent of the a's. This can be done by replacing the variables $a_1, \ldots, a_r$ in these functions by new variables $y_{n+1}(x), \ldots, y_{n+r}(x)$ satisfying the conditions $y'_{n+h} = 0$, $y_{n+h}(x_s) = a_h$ $(s = 1, 2)$. If one


Received November 16, 1938. 




also replaced the values $a_h$ appearing in the functions $g$, $g_\rho$ and the end conditions (1.3b) by $y_{n+h}(x_1)$, one would obtain a problem that is a special case of that of minimizing a function

\[ I = g[x_1, y(x_1), x_2, y(x_2)] + \int_{x_1}^{x_2} f(x, y, y')\,dx \]

in a class of arcs

\[ y_i(x) \qquad (x_1 \le x \le x_2;\ i = 1, \ldots, n) \]

satisfying conditions of the form

\[ \varphi_\gamma(x, y, y') = 0 \qquad (\gamma = 1, \ldots, m < n), \]

\[ I_\rho = g_\rho[x_1, y(x_1), x_2, y(x_2)] + \int_{x_1}^{x_2} f_\rho(x, y, y')\,dx = 0. \]

This general problem can also be reduced to one of type A by adjoining the conditions

\[ x_1 = a_1, \qquad y_i(x_1) = a_{i+1}, \qquad x_2 = a_{n+2}, \qquad y_i(x_2) = a_{n+i+2} \]

and expressing the functions $g$, $g_\rho$ in terms of the a's by the use of these equations.

The problem of Bolza as formulated by Bliss (I, II)¹ is the special case of the problem described at the end of the last paragraph in which the functions $f_\rho$ are identically zero. Similarly, the problem of Bolza as formulated by Morse (VIII) is the special case of problem A in which the conditions $I_\rho = 0$ are absent and the functions $f$, $\varphi_\gamma$ are independent of the a's. Conversely, problem A can be reduced to one of Bolza type, namely, to that of minimizing the function (1.1) in the class of admissible arcs

\[ a_h, \quad y_j(x) \qquad (x_1 \le x \le x_2;\ j = 1, \ldots, n + p) \]

satisfying the conditions (1.3a), (1.3b) and

\[ f_\rho(a, x, y, y') - y'_{n+\rho} = 0, \qquad y_{n+\rho}(x_1) = g_\rho(a), \qquad y_{n+\rho}(x_2) = 0. \]

The a's appearing in the functions $f$, $f_\rho$, $\varphi_\gamma$ can be replaced by new variables in the manner described above. The concepts of strong relative minima for problem A and the reduced problem of Bolza are, however, not equivalent concepts. It follows that one cannot obtain a complete theory for problem A from that for the problem of Bolza without additional arguments. In order to obtain a strong sufficiency theorem for problem A from that for the problem of Bolza one needs an effective theorem of Lindeberg. Such a theorem has been established recently by W. T. Reid (X; cf. V) who applies it to the problem described at the end of the second paragraph of this paper. A similar difficulty is encountered when one applies the theory of the problem of Bolza to the isoperimetric problem and to the problem of Euler recently studied by Brady (III).


1 Roman numerals in parentheses refer to the references at the end of this paper. 










On the other hand the problem of Bolza and the isoperimetric problems are immediate special cases of problem A. Similarly, the problem of Euler studied by Brady is the special case of problem A in which $f = 0$, $g_\rho = -a_\rho$, the functions $f_\rho$, $x_s$, $y_{is}$ are independent of the a's and the differential equations $\varphi_\gamma = 0$ are absent. For these problems and others of similar type one can obtain complete sets of necessary conditions and sufficient conditions for strong relative minima from those for problem A without further arguments.

During the last few years the problem of Bolza has been studied in great detail 
and the results obtained have been applied to problems that can be reduced 
to one of Bolza type. In view of the above remarks it appears to the author 
that it would be more economical and satisfactory to study first the generalized 
problem of Bolza here formulated or one of similar type and then apply the 
results so obtained to the various special cases. Most of the theory for prob- 
lem A can be established by simple modifications of the arguments used to 
develop the corresponding theory for the problem of Bolza. A new sufficiency 
proof is needed, however. It is the purpose of the present paper to give such 
a proof. The method used is an extension of one recently used by the author 
(VI) for the problem of Bolza. Numerous simplifications of the earlier method 
have been made. 


2. Preliminary remarks. In the following pages it will be assumed that the functions appearing in the expressions (1.1) and (1.3) are continuous and have continuous partial derivatives of the first three orders in a region $\mathfrak{R}$ of points $(a, x, y, y')$. We suppose further that $x_1(a) < x_2(a)$ and the matrix $\|\varphi_{\gamma y_i'}\|$ has rank m on $\mathfrak{R}$. The elements $(a, x, y, y')$ in $\mathfrak{R}$ will be called admissible. By an admissible arc C will be meant a continuous arc (1.2) in axy-space that can be subdivided into a finite number of subarcs on each of which it has continuous derivatives and has its elements $(a, x, y, y')$ admissible. This definition of admissible arcs is not the one previously used by the author, but is essentially the one used by Bliss (I, p. 10).

By an extremal E will be meant an admissible are (1.2) without corners and 
a set of multipliers /, , m,(x) having continuous derivatives y;’, I, = 0, m., and 
satisfying the Euler-Lagrange equations 


\[ F_{y_i} - \frac{d}{dx} F_{y_i'} = 0, \qquad \varphi_\gamma = 0 \qquad (\gamma = 1, \ldots, m), \tag{2.1} \]

where

\[ F(a, x, y, y', l, m) = f + l_\rho f_\rho + m_\gamma \varphi_\gamma. \]

An extremal E will be said to be non-singular if the determinant

\[ \begin{vmatrix} F_{y_i' y_k'} & \varphi_{\gamma y_i'} \\ \varphi_{\beta y_k'} & 0 \end{vmatrix} \tag{2.2} \]

is different from zero along E. From existence theorems for differential equations applied to equations (2.1) one obtains the following result (cf. I, pp. 33-36).
THEOREM 2.1. Every non-singular extremal E is a member of an (r + 2n + p)- 
parameter family of extremals 
(2.3) a, 942,¢,5¢60, 4, mle, @ 66.9 (m1 Sr 2) 
for special values a, = an, bi = bo, ci = co, 1, = lw, ro SX S To, where 
h=1,---,r;¢=1,---,njp = 1,---,p. The functions y;, yic, my, 2; 
= Fy: , ziz of (x, a, b, c, l) determined by this family are continuous and have con- 
tinuous first and second derivatives with respect to their arguments in a neighborhood 
of the values (x, a, b, c, l) belonging to E. The determinant 
Yio; Yie; ee 
(i,j = I, “om 


2. CA 
“tb ; tc; 


is different from zero along E. The parameters b; , c; can be chosen to be the values 
of yi, 2; at a fixed value of x. 

The principal theorem to be proved in the present paper is the following one. 
Precise definitions of the terms used will be given below. 

THEOREM 2.2. SUFFICIENT CONDITIONS FOR A STRONG RELATIVE MINIMUM. If a non-singular extremal E satisfies the conditions (1.3), the transversality condition (2.4), the Weierstrass condition $\mathrm{II}_N$, and is such that the second variation of I along E is positive, then there is a neighborhood $\mathfrak{F}$ of E in axy-space such that the inequality $I(C) > I(E)$ holds for every admissible arc C in $\mathfrak{F}$ satisfying the conditions (1.3) and not identical with E.

An extremal E is said to satisfy the transversality condition if every set of constants $da_1, \ldots, da_r$ satisfies with E the equation

\[ dg + l_\rho\,dg_\rho + \Big[(F - y_i' F_{y_i'})\,dx + F_{y_i'}\,dy_i\Big]_1^2 + \int_{x_1}^{x_2} F_{a_h}\,da_h\,dx = 0, \tag{2.4} \]

where $dx_1$, $dy_{i1}$, $dx_2$, $dy_{i2}$ are the differentials of the second members of equations (1.3b).

An extremal E is said to satisfy the Weierstrass condition $\mathrm{II}_N$ if at each element $(a, x, y, y', l, m)$ in a neighborhood N of those on E and satisfying the conditions $\varphi_\gamma = 0$ the inequality

\[ E(a, x, y, y', l, m, Y') \ge 0 \tag{2.5} \]

holds for every set $(Y')$ such that the element $(a, x, y, Y')$ is admissible and satisfies the conditions $\varphi_\gamma = 0$. Here

\[ E = F(Y') - F(y') - (Y_i' - y_i')\,F_{y_i'}(y'), \tag{2.6} \]

where the arguments in F and its derivatives not indicated are $(a, x, y, l, m)$. For a non-singular arc E satisfying the condition $\mathrm{II}_N$ the equality in (2.5) will hold only in case $(Y') = (y')$ if one chooses the neighborhood N so that the determinant (2.2) is different from zero on N. This result has been established recently by Reid and the author (VII).
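
The E-function can be evaluated directly in the simplest setting. The sketch below is illustrative only: it assumes one dependent variable, no parameters a, no side conditions (so that F reduces to f), and the hypothetical integrand $f = \sqrt{1 + y'^2}$ of the arc-length problem; the function names are assumptions of this example.

```python
import math

def excess(F, F_slope, y_prime, Y_prime):
    """E = F(Y') - F(y') - (Y' - y') * F_{y'}(y') for a single dependent variable."""
    return F(Y_prime) - F(y_prime) - (Y_prime - y_prime) * F_slope(y_prime)

def F(p):
    """Hypothetical integrand: the arc-length integrand sqrt(1 + p^2)."""
    return math.sqrt(1.0 + p * p)

def F_slope(p):
    """Derivative of F with respect to the slope p."""
    return p / math.sqrt(1.0 + p * p)

slopes = [k / 4.0 for k in range(-12, 13)]
values = [excess(F, F_slope, y, Y) for y in slopes for Y in slopes]
```

Since this integrand is convex in the slope, every computed value is non-negative up to rounding, and the value vanishes when $Y' = y'$.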

By the second variation of J along an extremal will be meant the expression 
(ef. VIII, pp. 520-521) 


(2.7) J (a, 0) = dronon + [ fukn, 2, 0 ode, 
whereh,k = 1, ---,r;7,7 =1,---,nand 
bie = gae + Ugone + ((F — yiFyi)atane + Fyiyionlemt 
(2.8a) + [(Fe — yiFy,)teaton + Fy; (cesyfion + Lentfion) + ZonMa, + Zeal e,)tZ2, 
28b) = Fyyniny + 2Fyyining + Poivininy + 2Fycag nin 


7’ Ul al 
+ 2F via, MHA + Pap Onen % 


Here and elsewhere the subscripts h, k on g, g, , Zs , Yie denote derivatives of the 
functions g(a), g,(a), z.(a), yis(a) with respect to a, , a, at the values of (a) be- 
longing to EZ. It is understood that the constants a, and the functions 7;(z) 
define an are in arn-space with continuity properties like those of admissible 
ares. These ares will be called admissible variations. The second variation 
J of I will be said to be positive along EZ if the inequality J(a, 7) > 0 holds for 
every set of admissible variations (a, 7) # (0, 0) satisfying with EZ the conditions 


(2.9a) ®,(a, z, 1, n’) = PryanXh + Pry Mi + Swit = 0 (y _ l, > i. m), 


(2.9b) ni(Xs) = Nish (s _ 1, 2), 
re 
(29) Isla, 9) = enon + [4a 2, 0, dz = 0 (0 =1,---,p), 
where 
(2.10) Nish = Yioh — Yi(Le)Len (s = 1, 2; s not summed) 
? 
(2.11) Coh = Jon + [f.ten]e=t a, = Soran &r + foys 2 + foyini« 


By an accessory extremal will be meant an admissible variation and a set of 
multipliers 


(2.12) Qh, n(x), Ap, p(x) (a1 Szs 22) 


having continuous derivatives 7;, 7:, \, = 0, wu, and satisfying the accessory 
differential equations 


(2.13) 2; — 5%; = 0, ©, = 0, 


where 


(2.14) 2 =a + A,w, + wyP, 







and the functions w, w, , ®, are defined by equations (2.8b), (2.11), (2.9a), respee- 
tively. If Z is non-singular, an accessory extremal is uniquely determined by 


the values of the functions a, , 7, f; = Q,;, A, at a point z = 2. We may 
denote therefore an accessory extremal by the symbols a, , 9; , £: , Ap. 
A set of functions u;;(x), v;;(z) (j = 1, --- , n) will be said to form a conjugate 


system for a non-singular extremal EF if there exists a set of n linearly inde. 
pendent accessory extremals of the form 


(2.15) aj =O, ny = Uz, Fiji = Vij, Api = 0 (j = 1,--- , N) 


such that u;on. — vijuie = 0 (¢,7,k = 1,---,n). It is well known that if the 
last equation holds at one value of z, it holds for all values of x on x;22 (I, p. 80). 
The following lemma will be useful in §5 below. 


LEMMA 2.1. If E is a non-singular extremal, there is a constant $\delta > 0$ such that for every subinterval $x'x''$ on $x_1x_2$ of length at most $\delta$ there exists a conjugate system $u_{ij}$, $v_{ij}$ for E having $|u_{ij}(x)| \neq 0$ on $x'x''$.


For, if $x_0$ is a value on $x_1x_2$, then a set of accessory extremals of the form (2.15) with $|u_{ij}(x_0)| \neq 0$ defines a conjugate system $u_{ij}$, $v_{ij}$ with $|u_{ij}(x)| \neq 0$ on a neighborhood of $x = x_0$. The lemma now follows by an application of the Heine-Borel Theorem.


3. Mayer fields. A region $\mathfrak{F}$ in axy-space and a set of slope functions $p_i(a, x, y)$ and multipliers $l_\rho(a)$, $m_\gamma(a, x, y)$, that are continuous and have continuous derivatives of the first two orders, will be said to define a Mayer field $\mathfrak{F}$ if the sets $(a, x, y, p)$ are admissible and satisfy the equations $\varphi_\gamma(a, x, y, p) = 0$ and the expression

\[ I^*(C) = g^*(a) + \int_{x_1}^{x_2} F^*(a, x, y, y')\,dx, \tag{3.1} \]

where $g^* = g + l_\rho g_\rho$ and

\[ F^* = F(a, x, y, p, l, m) + (y_i' - p_i)\,F_{y_i'}(a, x, y, p, l, m), \]

is independent of the path in $\mathfrak{F}$ in the sense that the value $I^*(C)$ is the same for all admissible arcs C in $\mathfrak{F}$ having the same end values $[a, x_1, y(x_1), x_2, y(x_2)]$. For an admissible arc C in $\mathfrak{F}$ satisfying the conditions (1.3c) one has the formula

\[ I(C) = I^*(C) + \int_{x_1}^{x_2} E(a, x, y, p, l, m, y')\,dx, \tag{3.2} \]

where E is the Weierstrass E-function (2.6).

A solution (1.2) of the equations $y_i' = p_i(a, x, y)$ can be shown (cf. I, pp. 102-103) to form an extremal with the multipliers $l_\rho(a)$, $m_\gamma[a, x, y(x)]$. Such an extremal will be called an extremal of the field. Through each point $(a, x, y)$ in $\mathfrak{F}$ there passes one and only one extremal of the field. Moreover, from the formula (3.2) it follows that the relation $I(E) = I^*(E)$ holds for every extremal of the field satisfying the conditions (1.3c).



















THEOREM 3.1. Let $E$ be an extremal of a Mayer field $\mathfrak{F}$ at each point of which
the inequality

(3.3)    $E[a, x, y, p(a, x, y), l(a), m(a, x, y), y'] > 0$

holds for every set $(y') \ne (p)$ such that $(a, x, y, y')$ is admissible and satisfies the
equations $\phi_\beta = 0$. Suppose further that the relation $I^*(C) \ge I^*(E)$ holds for every
admissible arc $C$ in $\mathfrak{F}$ satisfying the end conditions (1.3b), the equality holding only
in case $C$ and $E$ have the same components $a_h$. Then the inequality $I(C) > I(E)$
holds for every admissible arc $C$ in $\mathfrak{F}$ satisfying the conditions (1.3) and not identical
with $E$.

For by virtue of the formula (3.2) and the hypotheses of the theorem one has
$I(C) \ge I^*(C) \ge I^*(E) = I(E)$ for every admissible arc $C$ in $\mathfrak{F}$ satisfying the
conditions (1.3). The equality holds throughout only in case $y_i' = p_i$ along $C$
and the arcs $C$ and $E$ have the same components $a_h$ and hence the same initial
point, by equations (1.3b). But this implies that $C$ is an extremal of the field
with the same initial point as $E$. The arc $C$ is therefore identical with $E$ and
the theorem is proved.

The above theorem suggests the study of the problem of minimizing $I^*(C)$
in the class of admissible arcs $C$ satisfying the conditions (1.3b) but not necessarily
the conditions (1.3a) and (1.3c). Suppose now that $E$ is an extremal of a
Mayer field satisfying the conditions (1.3) and minimizing $I^*$ subject to the
conditions (1.3b). Then $E$ must satisfy the necessary conditions for a minimum
for this problem. It is clear that $E$ is an extremal relative to $I^*$. The transversality
condition for this new problem is obtained by replacing $F$, $g$ in (2.4)
by $F^*$, $g^*$, respectively. By the use of equations (1.3) and $y_i' = p_i$ it is found
that $E$ satisfies this new transversality condition if and only if it satisfies the
condition (2.4). Moreover, the second variation $J^*(a, \eta)$ of $I^*$ along $E$ must be
non-negative for every admissible variation $(a, \eta)$ satisfying the conditions
(2.9b). By a somewhat complicated but not difficult computation it is found
that along $E$

(3.4)    $J^*(a, \eta) = b_{hk}a_ha_k + 2\lambda_\rho c_{\rho h}a_h + 2\int_{x_1}^{x_2} \{\Omega + (\eta_i' - \pi_i)\Omega_{\pi_i}\}\,dx,$

where $b_{hk}$, $c_{\rho h}$, $\Omega$ are defined by equations (2.8a), (2.11), (2.14) and the arguments
in $\Omega$ and $\Omega_{\pi_i}$ are $(a, x, \eta, \pi, \lambda, \mu)$, the values $\pi_i$, $\lambda_\rho$, $\mu_\beta$ being determined
by the equations

(3.5)    $\pi_i = p_{ia_h}a_h + p_{iy_j}\eta_j, \qquad \lambda_\rho = l_{\rho a_h}a_h, \qquad \mu_\beta = m_{\beta a_h}a_h + m_{\beta y_j}\eta_j.$


The second variation $J^*$ of $I^*$ along $E$ is also expressible in the form

(3.6)    $J^*(a, \eta) = b_{hk}a_ha_k + 2\lambda_\rho c_{\rho h}a_h + \int_{x_1}^{x_2} \{2\Omega - F_{y_i'y_k'}(\eta_i' - \pi_i)(\eta_k' - \pi_k)\}\,dx,$

where the arguments in $\Omega$ are now $(a, x, \eta, \eta', \lambda, \mu)$. The equivalence of the
formulas (3.4) and (3.6) is easily established by expanding $2\Omega[a, x, \eta, \pi + (\eta' - \pi), \lambda, \mu]$
in terms of $\eta_i' - \pi_i$ by means of Taylor's formula.
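The Taylor step behind this equivalence can be checked on a toy case. The sketch below is only an illustration under stated assumptions: it replaces $\Omega$, viewed as a function of the slope argument alone, by a hypothetical two-dimensional quadratic with symmetric Hessian `A` (playing the role of $F_{y_i'y_k'}$), and compares the two integrands pointwise. The names `omega`, `integrand_34` and `integrand_36` are not from the paper.

```python
from fractions import Fraction

# Hypothetical data: Omega(v) = 1/2 v^T A v + c . v + k, with A symmetric.
# A stands in for the matrix F_{y_i' y_k'}; all values are made up.
A = [[Fraction(3), Fraction(1)], [Fraction(1), Fraction(2)]]
c = [Fraction(-1), Fraction(4)]
k = Fraction(5)


def omega(v):
    """Quadratic stand-in for Omega as a function of the slope argument."""
    quad = sum(A[i][j] * v[i] * v[j] for i in range(2) for j in range(2))
    return quad / 2 + sum(c[i] * v[i] for i in range(2)) + k


def omega_grad(v):
    """Gradient of omega with respect to the slope argument."""
    return [sum(A[i][j] * v[j] for j in range(2)) + c[i] for i in range(2)]


def integrand_34(eta_prime, pi):
    """Integrand in the style of (3.4): 2 * (Omega(pi) + (eta' - pi) . Omega_pi)."""
    g = omega_grad(pi)
    return 2 * (omega(pi) + sum((eta_prime[i] - pi[i]) * g[i] for i in range(2)))


def integrand_36(eta_prime, pi):
    """Integrand in the style of (3.6): 2 * Omega(eta') - (eta' - pi)^T A (eta' - pi)."""
    d = [eta_prime[i] - pi[i] for i in range(2)]
    quad = sum(A[i][j] * d[i] * d[j] for i in range(2) for j in range(2))
    return 2 * omega(eta_prime) - quad


eta_prime = [Fraction(7, 3), Fraction(-2)]
pi = [Fraction(1, 2), Fraction(5)]
print(integrand_34(eta_prime, pi) == integrand_36(eta_prime, pi))
```

Because the stand-in is exactly quadratic in the slope argument, the second-order Taylor expansion has no remainder, which is why the two integrands agree exactly rather than approximately.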


THEOREM 3.2. Let $E$ be an extremal of a Mayer field satisfying the conditions
(1.3) and (2.4). Suppose that the second variation $J^*(a, \eta)$ of $I^*$ along $E$ satisfies
the condition $J^*(a, \eta) > 0$ for every admissible arc $(a, \eta) \ne (0, \eta)$ satisfying the
conditions (2.9b). Then there is a neighborhood $\mathfrak{F}_1$ of $E$ in $axy$-space such that
the inequality $I^*(C) \ge I^*(E)$ holds for every admissible arc $C$ in $\mathfrak{F}_1$ satisfying the
conditions (1.3b), the equality holding only in case $C$ and $E$ have the same components $a_h$.


To prove this result we may suppose that the components $a_h$ belonging to $E$
are given by the set $a_h = 0$. Let $a_{hk}$, $\eta_{ik}$ $(k = 1, \dots, r)$ be a set of $r$ admissible
variations having continuous second derivatives, satisfying the conditions (2.9b)
and having $a_{kk} = 1$, $a_{hk} = 0$ $(h \ne k)$. Let

$y_i(x, a) = Y_i(x, a) + h_{i1}(a)[x_2(a) - x] + h_{i2}(a)[x - x_1(a)],$

where $Y_i$, $h_{is}$ are defined by the equations

$Y_i = y_i(x) + \eta_{ik}(x)a_k, \qquad h_{is}[x_2(a) - x_1(a)] = y_{is}(a) - Y_i[x_s(a), a],$

the functions $y_i(x)$ being those belonging to $E$ and the functions $x_s(a)$, $y_{is}(a)$
being those appearing in equations (1.3b). The $r$-parameter family of admissible arcs

(3.7)    $a_h,\ y_i(x, a) \qquad (x_1(a) \le x \le x_2(a))$

so obtained contains $E$ for values $a_h = 0$, satisfies the end conditions (1.3b) and
has $a_{hk}$, $\eta_{ik}$ as its variations along $E$. When the functions (3.7) are substituted
in the expression (3.1) for $I^*$, a function $I^*(a)$ is obtained having continuous
first and second derivatives. The value of $dI^*$ at $a_h = 0$ is equal to the value
of the first variation of $I^*$ along $E$ determined by the admissible variation
$a_h = da_h$, $\eta_i = \eta_{ik}\,da_k$ and is equal to zero since $E$ is an extremal for $I^*$ satisfying
the transversality condition for $I^*$, as was seen above. Similarly, the value of
$d^2I^*$ at $a_h = 0$ is equal to the value of $J^*(a, \eta)$ determined by the admissible
variation just described. Hence $d^2I^* > 0$ for all $(da) \ne (0)$ by virtue of our
hypothesis concerning $J^*(a, \eta)$. It follows that $I^*(a) > I^*(0) = I^*(E)$ for
every set $(a) \ne (0)$ in a neighborhood $A$ of $(a) = (0)$. Let $\mathfrak{F}_1$ be all points
$(a, x, y)$ in $\mathfrak{F}$ with $(a)$ in $A$ and consider an admissible arc $C$ in $\mathfrak{F}_1$ satisfying the
conditions (1.3b). Since the components $a_h$ belonging to $C$ determine an arc
(3.7) joining the ends of $C$, one has $I^*(C) = I^*(a)$. Hence $I^*(C) \ge I^*(E)$, the
equality holding only in case $C$ and $E$ have the same components $a_h$. This
proves Theorem 3.2.

THEOREM 3.3. Let $E$ be an extremal of a Mayer field $\mathfrak{F}$ satisfying the conditions
(1.3). The set of all points $(a, x, \eta)$ with $x$ on $x_1x_2$ together with the slope functions
$\pi_i$ and multipliers $\lambda_\rho$, $\mu_\beta$ given by (3.5) define an accessory Mayer field for the
problem of minimizing the second variation $J(a, \eta)$ of $I$ along $E$ subject to the
conditions (2.9). The invariant integral for this field is given by (3.4).


In order to establish this result let $a_h$, $\eta_i(x)$ and $a_h$, $\bar\eta_i(x)$ be admissible
variations having the same $a$'s and $\eta_i(x_s) = \bar\eta_i(x_s)$ $(s = 1, 2)$. Let $a_h = 0$,
$y_i(x)$ $(x_1 \le x \le x_2)$ be the functions defining $E$ and consider the family of admissible arcs

(3.8)    $a_h(e) = ea_h, \qquad y_i(x, e) = y_i(x) + e\eta_i(x) \qquad (x_1 \le x \le x_2).$

Let $I^*(e)$ be the values of $I^*$ determined by this family, and denote by $\bar I^*(e)$
the values of $I^*$ determined by the family obtained from (3.8) when $\eta_i$ is replaced
by $\bar\eta_i$. Since the arcs belonging to these two families for a particular
value of $e$ have the same end values, the function $H(e) = I^*(e) - \bar I^*(e)$ is identically
zero. But $H''(0) = J^*(a, \eta) - J^*(a, \bar\eta)$, as one readily verifies. We
have accordingly $J^*(a, \eta) = J^*(a, \bar\eta)$, as was to be proved.


THEOREM 3.4. Let $E$ be a non-singular extremal satisfying the conditions (1.3).
Suppose that there exists an accessory Mayer field for the second variation $J(a, \eta)$
of $I$ along $E$ of the type described in the last theorem. Then $E$ is an extremal of a
Mayer field $\mathfrak{F}$ such that the slope functions and multipliers of the accessory Mayer
field are the variations (3.5) along $E$ of the slope functions and multipliers of $\mathfrak{F}$.


This theorem can be established by an argument similar to those made in the 
next section. However, the proof of this theorem will be omitted in view of 
the fact that we shall make no explicit use of this result. 


4. Construction of fields. The following theorem establishes the existence 
of fields of the type described above. 


THEOREM 4.1. Let $E$ be a non-singular extremal for which there exists a conjugate
system $u_{ij}$, $v_{ij}$ having $|u_{ij}(x)| \ne 0$ along it and let

(4.1)    $a_{hk},\ \eta_{ik}(x),\ \lambda_{\rho k},\ \mu_{\beta k}(x) \qquad [a_{kk} = 1,\ a_{hk} = 0\ (h \ne k);\ h, k = 1, \dots, r]$

be a set of $r$ accessory extremals for $E$. There exists an $(r + n)$-parameter family
of extremals

(4.2)    $a_h,\ y_i(x, a, e),\ l_\rho(a),\ m_\beta(x, a, e)$

containing $E$ for values $x_1 \le x \le x_2$, $a_h = a_{h0}$ $(h = 1, \dots, r)$, $e_i = e_{i0}$ $(i = 1, \dots, n)$.
The functions $y_i$, $y_{ix}$, $l_\rho$, $m_\beta$, $z_i = F_{y_i'}$ of $(x, a, e)$ determined by this
family have continuous first and second derivatives in a neighborhood of the values
$(x, a, e)$ belonging to $E$. Moreover, along $E$ one has

(4.3)    $y_{ia_k} = \eta_{ik}, \quad l_{\rho a_k} = \lambda_{\rho k}, \quad m_{\beta a_k} = \mu_{\beta k}, \quad y_{ie_j} = u_{ij}, \quad z_{ie_j} = v_{ij}.$

The extremal $E$ is an extremal of a Mayer field $\mathfrak{F}$ with slope functions and multipliers

(4.4)    $p_i(a, x, y) = y_{ix}[x, a, e(a, x, y)], \qquad l_\rho(a), \qquad m_\beta(a, x, y) = m_\beta[x, a, e(a, x, y)],$

where $e_i(a, x, y)$ is the value of $e_i$ belonging to the extremal (4.2) passing through
the point $(a, x, y)$ in $\mathfrak{F}$.


To prove this suppose that $a_{h0} = 0$ and that the parameters $b_i$, $c_i$ in the
family (2.3) have been chosen to be the values of $y_i$, $z_i$ at a point $x = x_0$ on
$x_1x_2$. A family (4.2) having the properties described in the theorem can be
obtained from the family (2.3) by setting

(4.5)    $b_i = b_{i0} + \eta_{ik}(x_0)a_k + u_{ij}(x_0)e_j, \qquad c_i = c_{i0} + \zeta_{ik}(x_0)a_k + v_{ij}(x_0)e_j, \qquad l_\rho = l_{\rho 0} + \lambda_{\rho k}a_k,$

where $\zeta_{ik}$ are the values of $\Omega_{\eta_i'}$ determined by the accessory extremals (4.1).
The continuity properties of the family are immediate. From equations (4.5)
and the identities

$b_i = y_i(x_0, a, e), \qquad c_i = z_i(x_0, a, e)$

in $a_h$, $e_j$ it is found by differentiation that equations (4.3) hold at $x = x_0$ and
hence along $E$ since these functions are related in a unique manner with accessory
extremals. It follows that the determinant $|y_{ie_j}|$ is different from zero along $E$.
The equations $y_i = y_i(x, a, e)$ accordingly have unique solutions $e_i(a, x, y)$ in
a neighborhood $\mathfrak{F}$ of $E$ in $axy$-space. On the hyperspace $x = x_0$, $a_h = \text{const.}$,
the Hilbert integral

$\int \{F(a, x, y, p, l, m)\,dx + (dy_i - p_i\,dx)F_{y_i'}(a, x, y, p, l, m)\}$

determined by the functions (4.4) takes the form $\int c_i\,db_i = \int dW$, where

$2W(e) = 2(c_{i0} + \zeta_{ih}a_h)u_{ij}e_j + u_{ij}v_{ik}e_je_k \qquad (i, j, k = 1, \dots, n),$

and hence is independent of the path in $\mathfrak{F}$ (I, p. 106). It follows that the region
$\mathfrak{F}$ and the functions (4.4) define a Mayer field whose extremals are given by the
family (4.2). This proves Theorem 4.1.
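The exactness claim $c_i\,db_i = dW$ rests on the conjugacy relation $u_{ij}v_{ik} = u_{ik}v_{ij}$. A minimal sketch of that check follows, with made-up $2 \times 2$ data: `u` is arbitrary, `v` is built as a symmetric matrix times `u` so that the relation holds, and `kvec` is a hypothetical stand-in for the constants $c_{i0} + \zeta_{ih}a_h$ on the hyperspace $a_h = \text{const.}$

```python
from fractions import Fraction

# Hypothetical data at x = x0 (not from the paper).
u = [[2, 1], [0, 1]]
M = [[1, 3], [3, -2]]          # symmetric, so that u^T (M u) is symmetric
v = [[sum(M[i][k] * u[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
kvec = [5, -7]                 # stands in for c_{i0} + zeta_{ih} a_h

# G[j][k] = u_{ij} v_{ik}; conjugacy is the statement that G is symmetric.
G = [[sum(u[i][j] * v[i][k] for i in range(2)) for k in range(2)] for j in range(2)]


def hilbert_coefficient(e, j):
    """Coefficient of de_j in c_i db_i, with c_i = kvec_i + v_{ik} e_k and db_i = u_{ij} de_j."""
    c = [kvec[i] + sum(v[i][k] * e[k] for k in range(2)) for i in range(2)]
    return sum(c[i] * u[i][j] for i in range(2))


def grad_W(e, j):
    """Partial derivative of W(e) = kvec_i u_{ij} e_j + 1/2 G_{jk} e_j e_k with respect to e_j."""
    linear = sum(kvec[i] * u[i][j] for i in range(2))
    return linear + Fraction(1, 2) * sum((G[j][k] + G[k][j]) * e[k] for k in range(2))


e = [Fraction(3), Fraction(-4)]
print(G[0][1] == G[1][0])
print(all(hilbert_coefficient(e, j) == grad_W(e, j) for j in range(2)))
```

If `M` is replaced by a non-symmetric matrix, `G` generally stops being symmetric and the two coefficient lists no longer agree, which is the sense in which conjugacy is what makes the integral independent of the path.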


THEOREM 4.2. Suppose the hypotheses of Theorem 4.1 hold and let

(4.6)    $\pi_i = \eta_{ik}'a_k + u_{ij}'\epsilon_j, \qquad \lambda_\rho = \lambda_{\rho k}a_k, \qquad \mu_\beta = \mu_{\beta k}a_k + \nu_{\beta j}\epsilon_j,$

where $\epsilon_j = \epsilon_j(a, x, \eta)$ are the solutions of the equations $\eta_i = \eta_{ik}a_k + u_{ij}\epsilon_j$ and
$\nu_{\beta j}(x)$ are the multipliers $\mu_\beta$ belonging to the accessory extremals (2.15) defining
the conjugate system $u_{ij}$, $v_{ij}$. The functions (4.6) are the slope functions and
multipliers of the accessory Mayer field associated with the field $\mathfrak{F}$ described in
Theorem 4.1. In fact if the second variation $J(a, \eta)$ of $I$ along $E$ is positive along
$E$, the accessory extremals (4.1) can be chosen so that the invariant integral (3.4)
for the accessory Mayer field satisfies the condition $J^*(a, \eta) > 0$ for every admissible
variation $(a, \eta) \ne (0, \eta)$ satisfying the conditions (2.9b).










To prove this we note that if in the identity $y_i = y_i[x, a, e(a, x, y)]$ the variables
$y_i$, $a_h$ are replaced by $y_i + b\eta_i$, $a_h + ba_h$ and the result is differentiated
with respect to $b$, one obtains upon setting $b = 0$ and using the relations (4.3) the identity

$\eta_i = \eta_{ik}a_k + u_{ij}(e_{ja_k}a_k + e_{jy_k}\eta_k)$

along $E$. It follows that $\epsilon_j(a, x, \eta)$ are the variations of $e_j(a, x, y)$ along $E$.
The variations of the functions (4.4) along $E$ are therefore given by the set (4.6).
The first part of the theorem now follows from Theorem 3.3. The proof of the
last statement of the theorem is based on two lemmas, the first of which is the
following.


LEMMA 4.1. Let $E$ be a non-singular extremal and

(4.7)    $a_{hj},\ \eta_{ij}(x),\ \lambda_{\rho j},\ \mu_{\beta j}(x) \qquad (j = 1, \dots, t)$

a maximal set of accessory extremals for $E$ such that the variations $a_{hj}$, $\eta_{ij}$ are
linearly independent on $x_1x_2$. If the second variation $J(a, \eta)$ of $I$ is positive
along $E$, the matrix whose $j$-th row is given by the set

(4.8)    $a_{hj},\ \eta_{ij}(x_2),\ \eta_{ij}(x_1),\ J_\rho(a_j, \eta_j)$

has rank $t$. Moreover $t = r + 2n + p - q$, where $q$ is the number of linearly
independent accessory extremals $a_h$, $\eta_i$, $\lambda_\rho$, $\mu_\beta$ in a maximal set having $a_h = \eta_i(x) = (0)$
on $x_1x_2$.


For suppose the matrix described in the lemma did not have rank $t$. Then
there would exist multipliers $B_j$, not all zero, such that

$a_{hj}B_j = \eta_{ij}(x_2)B_j = \eta_{ij}(x_1)B_j = J_\rho(a_j, \eta_j)B_j = 0.$

The accessory extremal

$a_h = a_{hj}B_j, \qquad \eta_i = \eta_{ij}B_j, \qquad \lambda_\rho = \lambda_{\rho j}B_j, \qquad \mu_\beta = \mu_{\beta j}B_j$

would have $a_h = \eta_i(x_1) = \eta_i(x_2) = J_\rho(a, \eta) = 0$ and $(\eta) \not\equiv (0)$ on $x_1x_2$. By an
integration by parts with the help of equations (2.13) it would be found that

$J(a, \eta) = J(a, \eta) + 2\lambda_\rho J_\rho(a, \eta) = \int_{x_1}^{x_2} 2\Omega\,dx = [\eta_i\Omega_{\eta_i'}]_{x_1}^{x_2} = 0,$

contrary to our assumption that the second variation is positive along $E$. This
proves the first part of the lemma. The last statement follows from the fact
that there are $r + 2n + p$ linearly independent accessory extremals in a maximal
set.

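Consider now the expression

(4.9)    $H_\sigma(a, \eta) = l_{\rho\sigma}J_\rho(a, \eta) + \int_{x_1}^{x_2} m_{\beta\sigma}\Phi_\beta(a, \eta, \eta')\,dx \qquad (\sigma = 1, \dots, q),$

where $l_{\rho\sigma}$, $m_{\beta\sigma}$ are the multipliers belonging to a maximal set of linearly independent
accessory extremals $a_h$, $\eta_i$, $\lambda_\rho$, $\mu_\beta$ of the form

(4.10)    $a_{h\sigma} = 0, \qquad \eta_{i\sigma} = 0, \qquad l_{\rho\sigma}, \qquad m_{\beta\sigma} \qquad (\sigma = 1, \dots, q).$

Let $z_{i\sigma}(x)$ be the corresponding values of $\Omega_{\eta_i'}$ and let $w_{\sigma h}$ be the coefficient of $a_h$
in the expression (4.9). By the use of the accessory equations (2.13) for the
accessory extremals (4.10) it is found that

(4.11)    $H_\sigma(a, \eta) = w_{\sigma h}a_h + \int_{x_1}^{x_2} \{z_{i\sigma}'\eta_i + z_{i\sigma}\eta_i'\}\,dx = w_{\sigma h}a_h + [z_{i\sigma}\eta_i]_{x_1}^{x_2}.$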

The second lemma to be established is the following.

LEMMA 4.2. If $\bar a_h$, $\bar\eta_i$ is an admissible variation satisfying with a set of constants
$J_\rho$ the equations $H_\sigma(\bar a, \bar\eta) = l_{\rho\sigma}J_\rho$, there exists an accessory extremal $a_h$, $\eta_i$, $\lambda_\rho$, $\mu_\beta$
such that

(4.12)    $a_h = \bar a_h, \qquad \eta_i(x_s) = \bar\eta_i(x_s) \quad (s = 1, 2), \qquad J_\rho(a, \eta) = J_\rho.$

In particular if $\Phi_\beta(\bar a, x, \bar\eta, \bar\eta') = 0$, then $J_\rho(a, \eta) = J_\rho(\bar a, \bar\eta)$.

For since the accessory extremals (4.10) are linearly independent, the matrix
$\| z_{i\sigma}(x_1)\ \ l_{\rho\sigma} \|$ has rank $q$. The equations

$w_{\sigma h}a_h + z_{i\sigma}(x_2)\eta_{i2} - z_{i\sigma}(x_1)\eta_{i1} = l_{\rho\sigma}J_\rho$

are therefore linearly independent equations in the variables $a_h$, $\eta_{i2}$, $\eta_{i1}$, $J_\rho$
and have $t = r + 2n + p - q$ linearly independent solutions. By virtue of the
relations (4.9) and (4.11) a maximal set of linearly independent solutions of these
equations is given by the set (4.8). There is accordingly a linear combination
$a_h$, $\eta_i$, $\lambda_\rho$, $\mu_\beta$ of the extremals (4.7) satisfying equations (4.12). This proves
the lemma.

We are now in position to complete the proof of Theorem 4.2. To do so let

(4.13)    $a_{hk},\ \eta_{ik}(x),\ \lambda_{\rho k},\ \mu_{\beta k}(x) \qquad (k = 1, \dots, r)$

be a set of $r$ accessory extremals having $|a_{hk}| \ne 0$ and such that the last $r - r'$
of these form a maximal set of accessory extremals satisfying the conditions
(2.9) and having its matrix $\| a_{hl} \|$ $(l = r' + 1, \dots, r)$ of rank $r - r'$. We may
suppose without loss of generality that $a_{kk} = 1$, $a_{hk} = 0$ $(h \ne k)$ since this
choice is always possible if we first transform the $a$'s by the transformation
$a_h = a_{h0} + a_{hk}\bar a_k$, where $a_{h0}$ are the values of $a_h$ belonging to $E$. Let $\bar a_{hk}$, $\bar\eta_{ik}$
$(k = 1, \dots, r)$ be a set of $r$ admissible variations satisfying the conditions (2.9b)
and having $\bar a_{hk} = a_{hk}$, $\bar\eta_{il} = \eta_{il}$ $(l = r' + 1, \dots, r)$. It is clear from equations
(4.9) and (2.9) that the values $H_{\sigma k} = H_\sigma(\bar a_k, \bar\eta_k)$ are zero when $k > r'$. However,
the matrix $\| H_{\sigma\tau} \|$ has rank $r'$. Otherwise there would exist constants
$B_\tau$ $(\tau = 1, \dots, r')$, not all zero, such that $H_{\sigma\tau}B_\tau = H_\sigma(\bar a_\tau B_\tau, \bar\eta_\tau B_\tau) = 0$. But by
Lemma 4.2 with $J_\rho = 0$ there would exist an accessory extremal $a_h$, $\eta_i$, $\lambda_\rho$, $\mu_\beta$
satisfying the equations

$a_h = \bar a_{h\tau}B_\tau, \qquad \eta_i(x_s) = \bar\eta_{i\tau}(x_s)B_\tau \quad (s = 1, 2), \qquad J_\rho(a, \eta) = 0$

and hence also equations (2.9), contrary to our choice of the last $r - r'$ of the
accessory extremals (4.13).
Consider now the accessory extremals

(4.14)    $a_{hk},\ \eta_{ik}(x),\ \lambda_{\rho k} + bl_{\rho\sigma}H_{\sigma k},\ \mu_{\beta k} + bm_{\beta\sigma}H_{\sigma k},$

where $l_{\rho\sigma}$, $m_{\beta\sigma}$ are the multipliers belonging to the extremals (4.10) and $b$ is a
constant to be chosen below. Let $J^*(a, \eta)$ and $J_1^*(a, \eta)$ be the invariant integrals
(3.6) for the accessory Mayer fields determined respectively by the
accessory extremals (4.13) and (4.14) as described in the first part of Theorem
4.2. By the use of formulas (3.6) and (4.9) one finds that

$J_1^*(a, \eta) = J^*(a, \eta) + 2bH_{\sigma k}a_kH_\sigma(a, \eta).$

It follows from this equation, the relation $H_{\sigma k} = 0$ $(k > r')$ and the definition
of $H_\sigma(a, \eta)$ that the value of $J_1^*(a, \eta)$ determined by the variation $a_h$, $\eta_i = \bar\eta_{ik}a_k$
is given by the formula

(4.15)    $J_1^*(a, \bar\eta_ka_k) = J^*(a, \bar\eta_ka_k) + 2bH_{\sigma\tau}H_{\sigma\upsilon}a_\tau a_\upsilon \qquad (\tau, \upsilon = 1, \dots, r').$

When the first $r'$ of the $a$'s are zero, the arc $a_h$, $\eta_i = \bar\eta_{ik}a_k$ and the multipliers
$\lambda_{\rho k}a_k$, $\mu_{\beta k}a_k$ define an extremal of each of the accessory Mayer fields just constructed.
One then has $J_1^*(a, \bar\eta_ka_k) = J(a, \eta) > 0$ if $(a) \ne (0)$, since the second
variation $J$ of $I$ is positive along $E$. On the other hand the last term in (4.15)
is a positive definite quadratic form in the first $r'$ of the $a$'s since the matrix
$\| H_{\sigma\tau} \|$ $(\tau = 1, \dots, r')$ has rank $r'$. It follows from the theory of quadratic
forms that if the constant $b$ is chosen sufficiently large one has $J_1^*(a, \bar\eta_ka_k) > 0$
for all $(a) \ne (0)$. Moreover, for an arbitrary arc $(a, \eta) \ne (0, \eta)$ satisfying
the conditions (2.9b) one has $\eta_i(x_s) = \bar\eta_{ik}(x_s)a_k$ $(s = 1, 2)$ and hence $J_1^*(a, \eta)
= J_1^*(a, \bar\eta_ka_k) > 0$. The accessory extremals (4.14) accordingly have the
properties described in Theorem 4.2 and the theorem is established.
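The step "from the theory of quadratic forms" can be illustrated in the smallest non-trivial case. The sketch below assumes $r = 2$, $r' = 1$ and a single hypothetical coefficient `H = 1`; the matrix `J_STAR` is made up so that the form is indefinite overall but positive when the first component vanishes, mirroring the situation in (4.15). None of these numbers come from the paper.

```python
def is_positive_definite(m):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return m[0][0] > 0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0


# Hypothetical form J*(a) = -a1^2 + 2 a1 a2 + a2^2: indefinite, but positive when a1 = 0.
J_STAR = [[-1, 1], [1, 1]]
H = 1  # hypothetical single entry of the matrix ||H||, acting on a1 only


def j1_matrix(b):
    """Matrix of J*(a) + 2 b (H a1)^2, the analogue of the right side of (4.15)."""
    return [[J_STAR[0][0] + 2 * b * H * H, J_STAR[0][1]],
            [J_STAR[1][0], J_STAR[1][1]]]


def smallest_integer_b(limit=100):
    """Smallest non-negative integer b making the corrected form positive definite."""
    for b in range(limit + 1):
        if is_positive_definite(j1_matrix(b)):
            return b
    return None


print(smallest_integer_b())
```

In this toy case the form is positive definite exactly when $b > 1$, so any sufficiently large $b$ works, which is all the proof requires.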


5. Proof of Theorem 2.2. In the proof of the sufficiency conditions described
in Theorem 2.2 we make use of the following lemma.


LEMMA 5.1. In the proof of Theorem 2.2 one can assume without loss of generality
that there exists for the extremal $E$ a conjugate system $u_{ij}$, $v_{ij}$ having $|u_{ij}(x)| \ne 0$
on $x_1x_2$.


If one accepts for the moment the truth of this lemma one can prove Theorem
2.2 as follows: Select for $E$ a set of $r$ accessory extremals (4.1) having the properties
described in Theorem 4.2 and let $\mathfrak{F}$ be a Mayer field related to $E$ and these
accessory extremals in the manner described in Theorem 4.1. By the use of
Theorems 3.2 and 4.2 it is found that if one takes the neighborhood $\mathfrak{F}$ of $E$
sufficiently small, the relation $I^*(C) > I^*(E)$ will hold for every admissible arc
in $\mathfrak{F}$ satisfying the conditions (1.3b) and having its components $a_h$ distinct from
those belonging to $E$. Moreover, one can choose $\mathfrak{F}$ so small that the elements
$[a, x, y, p(a, x, y), l(a), m(a, x, y)]$ of the field will lie in the neighborhood $N$
of the elements $(a, x, y, y', l, m)$ on $E$ prescribed by the Weierstrass condition
$\mathrm{II}_N$ and on which the determinant (2.2) is different from zero. From the remarks
following formula (2.6) one finds that the condition (3.3) holds in $\mathfrak{F}$.
Theorem 2.2 now follows from Theorem 3.1.

Lemma 5.1 will be established by a transformation of our problem similar
to that used by Denbow (IV). In the proof of Lemma 5.1 we can suppose that
the functions $x_s(a)$ appearing in equations (1.3b) are constants since this result
can be brought about by replacing $x$ by a new variable $t$ by means of the transformation
$x = x_1(a) + t[x_2(a) - x_1(a)]$ $(0 \le t \le 1)$. Let $\delta$ be a constant related
to $E$ as described in Lemma 2.1. No generality is lost in assuming that $x_1 = 0$,
$x_2 = q\delta$, where $q$ is an integer. In fact we may suppose that $\delta = 1$ since this
result can be brought about by setting $x = \delta t$. We then have $x_1 = 0$, $x_2 = q$.
Let $C$ be an admissible arc (1.2) satisfying the conditions (1.3) and denote by
$C_\sigma$ $(\sigma = 1, \dots, q)$ the subarc of $C$ determined by the interval $\sigma - 1 \le x \le \sigma$.
The problem here studied, which we have designated as problem A, can be
considered as the problem of minimizing the expression

$I = g(a) + \int_{C_1} f\,dx + \cdots + \int_{C_q} f\,dx$

in the class of admissible subarcs $C_1, \dots, C_q$ such that

$I_\rho = g_\rho(a) + \int_{C_1} f_\rho\,dx + \cdots + \int_{C_q} f_\rho\,dx = 0$

and having the following properties: Each subarc satisfies the conditions
$\phi_\beta = 0$; the final end point $(a, x, y) = (a, \tau, b_\tau)$ of $C_\tau$ $(\tau < q)$ is identical with
the initial point of $C_{\tau+1}$; the initial point of $C_1$ is $(a, 0, y_1(a))$ and the final end
point of $C_q$ is $(a, q, y_2(a))$. If we map the subarc $C_\sigma$ into an arc in $atY_\sigma$-space
by the transformation

$x = X_\sigma(t) = \sigma - 1 + t, \qquad y_i = Y_{i\sigma},$

it is seen that the above formulation of problem A is equivalent to the problem
of minimizing the expression

$I = g(a) + \int_0^1 \{f(a, X_1(t), Y_1, Y_1') + \cdots + f(a, X_q(t), Y_q, Y_q')\}\,dt$

in the class of admissible arcs

$a_h,\ b_{i\tau},\ Y_{i\sigma}(t) \qquad (0 \le t \le 1)$
$(h = 1, \dots, r;\ i = 1, \dots, n;\ \tau = 1, \dots, q - 1;\ \sigma = 1, \dots, q)$

satisfying the conditions

$\phi_\beta(a, X_\sigma(t), Y_\sigma, Y_\sigma') = 0,$

$Y_{i1}(0) = y_{i1}(a), \qquad Y_{i\upsilon}(0) = b_{i\tau} = Y_{i\tau}(1), \qquad Y_{iq}(1) = y_{i2}(a) \qquad (\upsilon = \tau + 1),$

$I_\rho = g_\rho(a) + \int_0^1 \{f_\rho(a, X_1(t), Y_1, Y_1') + \cdots + f_\rho(a, X_q(t), Y_q, Y_q')\}\,dt = 0.$
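The stacking of the $q$ unit subintervals onto $0 \le t \le 1$ can be sketched numerically. The code below is only an illustration: the integrand `f` depends on $x$ alone (a simplifying assumption, since the paper's $f$ also depends on $a$, $y$, $y'$), and the midpoint rule and function names are choices made for this example.

```python
def X(sigma, t):
    """The map x = X_sigma(t) = sigma - 1 + t carrying 0 <= t <= 1 onto sigma - 1 <= x <= sigma."""
    return sigma - 1 + t


def integral_direct(f, q, steps_per_unit=1000):
    """Midpoint-rule approximation of the integral of f over 0 <= x <= q."""
    h = 1.0 / steps_per_unit
    return sum(f((j + 0.5) * h) for j in range(q * steps_per_unit)) * h


def integral_stacked(f, q, steps=1000):
    """Midpoint-rule approximation of the integral over 0 <= t <= 1 of f(X_1(t)) + ... + f(X_q(t))."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += sum(f(X(sigma, t)) for sigma in range(1, q + 1))
    return total * h


def f(x):
    """Hypothetical integrand depending on x only."""
    return x * x


print(integral_direct(f, 3), integral_stacked(f, 3))
```

Both routines sample the same points, only grouped differently, so the two values agree up to rounding; the stacked form is the one in which all $q$ subarcs share the single parameter interval $0 \le t \le 1$.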







By considering subarcs, we can easily see that the extremal $E$ with multipliers
$l_\rho$, $m_\beta(x)$ is transformed into an extremal $E^*$ with multipliers $l_\rho$, $m_\beta[X_\sigma(t)]$.
The Weierstrass and non-singularity conditions for $E^*$ are equivalent to those
for the subarcs of $E$ and hence to those for $E$ itself. Moreover, the first and
second variations of $I$ along $E$ are transformed into the first and second variations
of the function $I$ along $E^*$, admissible variations being transformed in the
same manner as admissible arcs. Furthermore accessory extremals for $E$ with
discontinuities at integral values of $x$ are transformed into accessory extremals
for $E^*$ without discontinuities. Finally by Lemma 2.1 and our choice of the
subintervals $\sigma - 1 \le x \le \sigma$ there exist $q$ conjugate systems $u_{ij}^\sigma$, $v_{ij}^\sigma$ $(\sigma = 1, \dots, q)$
such that for each $\sigma$ the determinant $|u_{ij}^\sigma(x)|$ is different from zero on
$\sigma - 1 \le x \le \sigma$. Let $U_{i\sigma, j\upsilon}(t)$, $V_{i\sigma, j\upsilon}(t)$ $(i, j = 1, \dots, n;\ \sigma, \upsilon = 1, \dots, q)$ be
identically zero when $\sigma \ne \upsilon$ and equal to $u_{ij}^\sigma[X_\sigma(t)]$, $v_{ij}^\sigma[X_\sigma(t)]$ when $\sigma = \upsilon$.
The functions so defined form a conjugate system for $E^*$, as one readily verifies.
Moreover, the determinant $|U_{i\sigma, j\upsilon}|$, where the indices $i$, $\sigma$ determine the rows
and $j$, $\upsilon$ the columns, is equal to the product of the determinants $|u_{ij}^\sigma[X_\sigma(t)]|$
and is accordingly different from zero on $0 \le t \le 1$. This proves Lemma 5.1,
and the proof of Theorem 2.2 is complete.
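The determinant statement is the familiar fact that a block-diagonal matrix has determinant equal to the product of the determinants of its blocks. A small sketch with made-up $2 \times 2$ blocks (standing in for the matrices $\|u_{ij}^\sigma\|$ at one value of $t$, with $q = 3$) is given below; the helper names and the exact elimination routine are choices made for this example.

```python
from fractions import Fraction


def det(m):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, val in enumerate(row):
                out[offset + i][offset + j] = val
        offset += len(b)
    return out


# Hypothetical blocks, each with non-vanishing determinant.
blocks = [[[2, 1], [1, 3]], [[0, 1], [-4, 2]], [[5, 2], [1, 1]]]

product = Fraction(1)
for b in blocks:
    product *= det(b)

print(det(block_diagonal(blocks)) == product)
```

Since each block has a non-zero determinant, so does the stacked matrix, which is the property the lemma needs on the whole interval.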


THEOREM 5.1. Let $E$ be an extremal satisfying the hypotheses of Theorem 2.2
and such that there exists for $E$ a conjugate system $u_{ij}$, $v_{ij}$ having $|u_{ij}(x)| \ne 0$
on $x_1x_2$. Then there exists a set of multipliers $l_\rho(a)$ such that $E$ affords a proper
strong relative minimum to the expression $I + l_\rho(a)I_\rho$ relative to neighboring admissible
arcs satisfying the conditions (1.3a) and (1.3b) but not necessarily (1.3c).


To prove this we choose $l_\rho(a)$ to be the multipliers of the Mayer field used
in the proof of Theorem 2.2. By examining the proofs of Theorems 2.2 and 3.1,
we easily see that these multipliers have the property described in Theorem 5.1.


REFERENCES 


I. G. A. Bliss, The problem of Bolza in the calculus of variations, Lectures delivered
at the University of Chicago, Winter Quarter, 1935. See also Annals of Mathematics,
vol. 33(1932), pp. 261-274.

II. G. A. Bliss, Normality and abnormality in the calculus of variations, Transactions
of the American Mathematical Society, vol. 43(1938), pp. 365-376.

III. C. P. Brady, The minimum of a function of integrals in the calculus of variations,
Dissertation, The University of Chicago, 1938.

IV. C. H. Denbow, A generalized form of the problem of Bolza, Contributions to the
calculus of variations, 1933-1937, The University of Chicago, pp. 449-484.

V. W. L. Duren, Contractable problems of Bolza, Bulletin of the American Mathematical
Society, vol. 42(1936), p. 812, abstract no. 418, and vol. 44(1938), p. 32,
abstract no. 17.

VI. M. R. Hestenes, A direct sufficiency proof for the problem of Bolza in the calculus of
variations, Transactions of the American Mathematical Society, vol. 42(1937),
pp. 141-153.

VII. M. R. Hestenes and W. T. Reid, A note on the Weierstrass condition in the calculus
of variations, to appear in Bulletin of the American Mathematical Society, vol.
45 (1939).

VIII. Marston Morse, Sufficient conditions in the problem of Lagrange with variable end
points, American Journal of Mathematics, vol. 53(1931), pp. 517-546.

IX. W. T. Reid, A direct expansion proof of sufficient conditions for the non-parametric
problem of Bolza, Transactions of the American Mathematical Society, vol. 42
(1937), pp. 183-190.

X. W. T. Reid, Isoperimetric problems of Bolza in non-parametric form (to be published
soon).

Further references will be found in these papers.


THE UNIVERSITY OF CHICAGO.
















A GENERALIZED LAMBERT SERIES 


By J. M. Dobbie


1. Introduction. In his important paper of 1913 Knopp [5]¹ proposed to call
any series of the form

(1.1)    $\sum_{n=1}^{\infty} b_n x^n (1 - x^n)^{-1}$

a Lambert series in honor of J. H. Lambert who in 1771 was the first to treat
special series of this type [6].
The purpose of this paper is to discuss a more general series, namely:

(1.2)    $\sum_{n=1}^{\infty} b_n x^{\lambda n} h(x^n),$

where $\lambda$ is a positive integer and $h(x)$ is a function of $x$ which is analytic in the
interior of the unit circle and which has a value different from zero at $x = 0$.
In §§3 and 4 we suppose further that $h(x)$ has on the unit circle a finite number
of singularities of which at least one is a pole.
In his paper Knopp proves the following theorem.²
Let the coefficients $b_n$ of the series (1.1) be such that for a definite integer
$k$ all the $k$ series

(1.3)    $\sum_{\nu=1}^{\infty} \frac{b_{k\nu+l}}{k\nu + l} \qquad (l = 0, 1, 2, \dots, k - 1)$

converge. Then if for such a $k$ and for $k'$ prime to $k$ we write $x_0 = e^{2\pi i k'/k}$, we
have for radial approach of $x$ to $x_0$ the relation

$\lim_{x \to x_0} \Big\{ (1 - x/x_0) \sum_{n=1}^{\infty} b_n x^n (1 - x^n)^{-1} \Big\} = \sum_{\nu=1}^{\infty} \frac{b_{k\nu}}{k\nu}.$

From this theorem it follows that the function defined by the series (1.1) cannot
be continued analytically across the unit circle if the hypotheses of the theorem
are satisfied for an infinite number of values of $k$ for each of which the series
$\sum_{\nu=1}^{\infty} b_{k\nu}/(k\nu)$ satisfies the additional condition of having its sum different from zero.
r=] 
Recently Mary Cleophas Garvin [2] obtained corresponding results for series 
of the form 


ba) 
(1.4) > bar"(1 — 2"), 
n=] 
which are obtained from our series in the case h(x) = 1/(1 — 2“). 


Received November 16, 1938.
¹ The numbers in brackets refer to the list of references at the end of the paper.
² Theorem 3 of §2.


Later in the same year in which Knopp's paper appeared Hardy [3] showed
that Knopp's theorem can be generalized by replacing the hypothesis of the
convergence of the series (1.3) by the hypothesis that these series are
summable Cesàro $(C, p)$ for some non-negative integer $p$. He also
showed that a part of this latter hypothesis could be replaced by a more general
one, namely, that the $p$-th Cesàro sums $A_\nu^p$ formed from the series
$\sum_{\nu=0}^{\infty} a_\nu = \sum_{\nu=0}^{\infty} b_{k\nu+l}$ $(l = 1, 2, \dots, k - 1)$ satisfy the relation $A_\nu^p = o(\nu^{p+1})$, the hypothesis
on the series $\sum_{\nu=1}^{\infty} b_{k\nu}/(k\nu)$ and the conclusion remaining the same. He
pointed out that this latter theorem, as well as Knopp's, is equivalent to two
theorems, the second of which he states in more general form.
In §§3 and 4 of this paper we prove theorems for our generalized series (1.2)
which correspond to these theorems of Hardy for ordinary Lambert series.
The theorems of §§3 and 4 are the most important results of this paper.


2. Definition of the series. Convergence. Necessary and sufficient conditions
for expanding a function in a series of this type. Let $h(t)$ be a function
which is analytic in the interior of the unit circle and which has a value different
from zero at $t = 0$. Write its power series expansion as

(2.1)    $h(t) = a_\lambda + a_{\lambda+1}t + a_{\lambda+2}t^2 + \cdots,$

where $a_\lambda \ne 0$ and $\lambda$ is a positive integer. Then form the series

(2.2)    $f(t) = \sum_{n=1}^{\infty} b_n t^{\lambda n} h(t^n).$

This affords a generalization of the Lambert series (1.1) and the series (1.4) of
Garvin.
Using well known methods, we can easily prove

THEOREM 2.1. If the series $\sum_{n=1}^{\infty} b_n t^{\lambda n}$ converges in the interior of the circle $|t| = r$
(whence $r \le 1$), then the series (2.2) converges absolutely and uniformly in every
closed region in the interior of $|t| = r$. Moreover, expansions of the form (2.2)
are unique.

The uniqueness of the expansions follows from the hypothesis that $h(0) \ne 0$.
For, if $f(t)$ has a second convergent expansion $\sum_{n=1}^{\infty} B_n t^{\lambda n} h(t^n)$ in the neighborhood
of $t = 0$ and we equate the derivatives of order $k\lambda$ $(k = 1, 2, 3, \dots)$ of the two
expansions at $t = 0$, we get, the contributions of the terms with $n < k$ agreeing by
induction on $k$, $B_k(k\lambda)!\,h(0) = b_k(k\lambda)!\,h(0)$. Since $h(0) \ne 0$, $B_k = b_k$
$(k = 1, 2, \dots)$.

From this theorem we have that if the associated power series

(2.3)    $\sum_{n=1}^{\infty} b_n t^{\lambda n}$

has a radius of convergence different from zero, then the function $f(t)$ is analytic
in the neighborhood of the origin and has a power series expansion of the form

(2.4)    $f(t) = \alpha_\lambda t^\lambda + \alpha_{\lambda+1}t^{\lambda+1} + \alpha_{\lambda+2}t^{\lambda+2} + \cdots.$

From (2.1) and (2.2) we have

(2.5)    $f(t) = \sum_{n=1}^{\infty} b_n(a_\lambda t^{\lambda n} + a_{\lambda+1}t^{(\lambda+1)n} + a_{\lambda+2}t^{(\lambda+2)n} + \cdots).$

Comparing (2.5) with (2.4) we see that

(2.6)    $\alpha_n = \sum_{d \mid n} b_d a_{n/d},$

the sum running over all the divisors $d$ of $n$, with the understanding that $a_i = 0$,
$i < \lambda$. In general, the system of equations (2.6) cannot be solved uniquely
for the $b$'s in terms of the $\alpha$'s. However, if $\lambda = 1$, this is possible, since in this
case the equations (2.6) are consistent in the $b$'s and the recurrence relation
becomes

$b_1a_1 = \alpha_1; \qquad b_na_1 = \alpha_n - \sum_{d \mid n,\ d < n} b_d a_{n/d} \qquad (n = 2, 3, 4, \dots),$

in which $a_1 \ne 0$.
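The recurrence for the case $\lambda = 1$ is easy to run. The sketch below assumes the reading in which $a_n$ are the coefficients of $h$ and $\alpha_n$ those of $f$; the function name `lambert_coefficients` and the list layout (index 0 unused) are choices made for this example. As a check it uses the classical case $h(t) = 1/(1 - t)$, so that every $a_n = 1$, with $f(t) = t/(1 - t)^2$, so that $\alpha_n = n$; the resulting $b_n$ should then be Euler's totient values.

```python
from fractions import Fraction


def lambert_coefficients(alpha, a):
    """Solve alpha_n = sum over d | n of b_d * a_{n/d} for b_1, ..., b_N (case lambda = 1).

    alpha and a are lists indexed from 1 (index 0 is unused); a[1] must be non-zero.
    """
    n_max = len(alpha) - 1
    b = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Contribution of the proper divisors d < n, already known.
        known = sum(b[d] * a[n // d] for d in range(1, n) if n % d == 0)
        b[n] = (Fraction(alpha[n]) - known) / a[1]
    return b


N = 8
a = [0] + [1] * N              # h(t) = 1/(1 - t): all coefficients equal 1
alpha = list(range(N + 1))     # f(t) = t/(1 - t)^2: alpha_n = n
b = lambert_coefficients(alpha, a)
print([int(x) for x in b[1:]])
```

With $\alpha_1 = 1$ and all other $\alpha_n = 0$ (that is, $f(t) = t$) the same routine returns the Möbius function values, which matches the classical Lambert series identities.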

Hence, for every function analytic at zero and vanishing there, such as that in
(2.4) (with $\lambda = 1$), there exists a formal transformation into a series of the form
(2.2) (with $\lambda = 1$). It is easy to show that this series always converges in the
neighborhood of $t = 0$. Thus, we have


THEOREM 2.2. A necessary and sufficient condition that a function $f(t)$ shall
admit a convergent expansion in the form $\sum_{n=1}^{\infty} b_n t^n h(t^n)$ is that $f(t)$ shall be analytic
at $t = 0$ and shall vanish there.


If, for general $\lambda$, we consider only those equations in (2.6) for which $n = k\lambda$
$(k = 1, 2, 3, \dots)$, the resulting expansion $\sum_{n=1}^{\infty} b_n t^{\lambda n} h(t^n)$ converges in the neighborhood
of $t = 0$ but does not necessarily represent $f(t)$ (as given in (2.4)) there,
since only the derivatives of orders $0, 1, 2, \dots, \lambda - 1, \lambda, 2\lambda, 3\lambda, \dots$
are certain to coincide at $t = 0$. However, if $h(t)$ and $f(t)$ are such that $a_n = \alpha_n = 0$
for all values of $n$ which are not positive integral multiples of $\lambda$, we get


THEOREM 2.3. A necessary and sufficient condition that a function $f(t)$ shall
admit a convergent expansion in the form $\sum_{n=1}^{\infty} b_n t^{\lambda n} h(t^n)$ is that $f(t)$ shall be analytic
at $t = 0$ and shall vanish there.










From this last theorem it is seen that we may confine our attention to the
case $\lambda = 1$ in seeking the solution of the system (2.6) for series of the form
$\sum_{n=1}^{\infty} b_n t^n h(t^n)$. As a special case of (2.6) consider


(1 forn = l, 
(2.7) D> Bacensa = 4 a 6; = 1, 


din 0 otherwise, 


which we know has a solution which gives rise to

$t = \sum_{\nu=1}^{\infty} \beta_\nu t^{\nu} h(t^{\nu}).$

From this equation we have

(2.8) $\sum_{\mu=1}^{\infty} a_\mu t^{\mu} = \sum_{\mu=1}^{\infty} a_\mu \sum_{\nu=1}^{\infty} \beta_\nu t^{\mu\nu} h(t^{\mu\nu}) = \sum_{m=1}^{\infty} \sum_{d \mid m} a_d \beta_{m/d}\, t^{m} h(t^{m}),$

the interchange of operations being justified by Theorem 2.1. From equation
(2.8) and equation (2.2) (with $\lambda = 1$), we get

$b_n = \sum_{d \mid n} a_d \beta_{n/d};$

and this gives an effective solution of (2.6) provided we can solve (2.7) for the
$\beta$'s in terms of the $\alpha$'s.
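
The two-step solution (first the $\beta$'s, then a divisor sum) can be illustrated as follows. This is a sketch only, assuming that (2.7) reads $\sum_{d \mid n} \beta_d\,\alpha_{n/d} = 1$ for $n = 1$ and $0$ otherwise; the helper names are hypothetical. For the ordinary Lambert series (all $\alpha_k = 1$) the $\beta_n$ come out as the Möbius function, and with $a_n$ equal to the sum of the divisors of $n$ the formula $b_n = \sum_{d \mid n} a_d \beta_{n/d}$ gives $b_n = n$.

```python
from fractions import Fraction

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def beta_coefficients(alpha, N):
    """Solve sum_{d|n} beta_d * alpha_{n/d} = [n == 1] for beta_1..beta_N.

    alpha is indexed from 1 (index 0 unused); alpha[1] must be nonzero.
    """
    beta = [Fraction(0)] * (N + 1)
    for n in range(1, N + 1):
        rhs = Fraction(1 if n == 1 else 0)
        known = sum(beta[d] * alpha[n // d] for d in divisors(n) if d < n)
        beta[n] = (rhs - known) / alpha[1]
    return beta

def b_from_beta(a, beta, N):
    """b_n = sum_{d|n} a_d * beta_{n/d}; index 0 is unused padding."""
    return [Fraction(0)] + [
        sum(a[d] * beta[n // d] for d in divisors(n)) for n in range(1, N + 1)
    ]

N = 12
alpha = [0] + [Fraction(1)] * N                         # ordinary Lambert series
beta = beta_coefficients(alpha, N)                      # Moebius function here
a = [0] + [sum(divisors(n)) for n in range(1, N + 1)]   # a_n = sigma(n)
b = b_from_beta(a, beta, N)
print(beta[1:], b[1:])
```

The point of the design is that the $\beta$'s depend only on $h$, so they can be computed once and reused for every $f$.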


3. Some theorems on limits. As a basis for the discussion of the existence
of a natural boundary for a function defined by a generalized Lambert series
of the form (2.2), we prove two theorems on limits in this section. These
theorems are generalizations of Theorems 2 and 3 of Hardy [3] for ordinary
Lambert series.
THEOREM 3.1. Let $h(y)$ be a function which is analytic in the interior of the
unit circle and which has on the unit circle a finite number of singularities of which
at least one is a pole,³ say at $y = 1$ and of order $r$. Let the series $\sum_{n=1}^{\infty} a_n$ be summable
$(C, p)$ with sum $S$ and let $\lambda$ be a positive integer. Then, if $y \to 1$ through
real values less than 1, we have

(3.1) $\lim_{y \to 1} \sum_{n=1}^{\infty} a_n n^{r} y^{\lambda n} (1 - y)^{r} h(y^{n}) = SA,$

where $A$ is the limit of $(1 - y)^{r} h(y)$ as $y \to 1$.


In the proof of this theorem we shall need a theorem proved by Bromwich 
[1] in 1908. For our purpose we state the theorem in the following form: 


3 If the pole is at z = e**'*, apply the rotation z = e***+y. 













If $\sum a_n$ is summable $(C, p)$ with sum $S$ and $f_n(y)$ is a function of $y$ such that

(1) $\sum n^{p}\,|\Delta^{p+1} f_n(y)| < K$ (independent of $y$) for $0 \le y < 1$,
(2) $\lim_{n \to \infty} n^{p} f_n(y) = 0$ for $0 \le y < 1$,
(3) $\lim_{y \to 1} f_n(y) = A$ (independent of $n$),

then $\sum a_n f_n(y)$ converges for $0 < y < 1$, and $\lim_{y \to 1} \sum a_n f_n(y) = SA$.

To apply this theorem to the limit in Theorem 3.1 above, let

$f_n(y) = n^{r} y^{\lambda n} (1 - y)^{r} h(y^{n}) = n^{r} y^{\lambda n} (1 + y + y^{2} + \cdots + y^{n-1})^{-r} g(y^{n}),$

where $g(y) = (1 - y)^{r} h(y)$. Then condition (3) of Bromwich's theorem is
satisfied, for $\lim_{y \to 1} f_n(y) = \lim_{y \to 1} g(y) = A$, which is independent of $n$. Also, it is
easy to show that condition (2) is satisfied.
To show that condition (1) is satisfied we first show that $\Delta^{p+1} f_n(y)$ can be
written

(3.2) $\Delta^{p+1} f_n(y) = (1 - y)^{p+1} y^{\lambda n} f_{n,p+1}(y),$

where

(3.3) $|f_{n,p+1}(y)| < B$ (independent of $y$ for $0 \le y \le 1$).

This involves showing that $\lim_{y \to 1} f_{n,p+1}(y)$ is finite and independent of $n$, where

$f_{n,p+1}(y) = (1 - y)^{-(p+1)} \sum_{s=0}^{p+1} (-1)^{s} \binom{p+1}{s} (n + s)^{r} y^{\lambda s} (1 + y + y^{2} + \cdots + y^{n+s-1})^{-r} g(y^{n+s}).$

The proof of this statement is elementary but a bit tedious and is omitted here.
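If we use equation (3.2) with condition (3.3), condition (1) is implied by

(3.4) $\sum_{n=1}^{\infty} n^{p}\,|\Delta^{p+1} f_n(y)| < B(1 - y)^{p+1} \sum_{n=1}^{\infty} n^{p} y^{\lambda n},$

where $B$ is independent of $y$ for $0 \le y \le 1$. But $\sum_{n=1}^{\infty} n^{p} y^{\lambda n}$ has a pole of order
$p + 1$ at $y = 1$ as its only singularity in the interval $0 \le y \le 1$ as is seen by
inspection and simple induction from the relation

$\sum_{n=1}^{\infty} n^{p} y^{\lambda n} = \lambda^{-p} \left(y \frac{d}{dy}\right)^{p} \frac{y^{\lambda}}{1 - y^{\lambda}},$

in which the operator $y \frac{d}{dy}$ is applied $p$ times. Hence, from (3.4) it follows that
there exists a constant $K$ (independent of $y$ for $0 \le y \le 1$) such that

$\sum n^{p}\,|\Delta^{p+1} f_n(y)| < K.$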














Therefore, Bromwich's theorem can be applied to show that the limit in
(3.1) is $SA$.


THEOREM 3.2. Let $h(x)$ be defined as in Theorem 3.1 and let $a$ be any point
inside or on the unit circle which is not an essential singularity of $h(x)$. Write
$h(x) = (a - x)^{\sigma} g(x)$, where $g(x)$ is analytic at $x = a$ and does not vanish there.
Let $l$ be any integer of the set $1, 2, 3, \dots, k - 1$. Form the series

$F(\rho) = \sum_{\nu=0}^{\infty} b_{k\nu+l}\,\rho^{\lambda(k\nu+l)}\,h(a\rho^{k\nu+l})$

and let the coefficients $b_{k\nu+l}$ be such that the p-th Cesàro sum formed from the series
$\sum_{\nu=0}^{\infty} (k\nu + l)^{\sigma} b_{k\nu+l}$ satisfies the relation $B^{p}_{\nu,l} = o(\nu^{p+1})$. Then $\lim_{\rho \to 1} (1 - \rho)^{1-\sigma} F(\rho) = 0$.
We can write lim (1 — p)’*F(p) as 


~~ 


a’ lim (1 — p) > (kv + 1)* devsa(v + U/k) (1 — p”*™")"(1 — p*) gap 


In the series above let B,,, = (kv + 1)*by,4, and let 
forlo) = (v + Vk — 1 — 0") %g(ap”*). 


An argument similar to that used in Theorem 3.1 can be made to show that
$\Delta^{p+1} f_{\nu,l}(\rho) = (1 - \rho)^{p+1} \rho^{\lambda k\nu}\,O(1)$. Since $\sum B_{\nu,l}\, f_{\nu,l}(\rho) = \sum B^{p}_{\nu,l}\,\Delta^{p+1} f_{\nu,l}(\rho)$ and
by hypothesis $B^{p}_{\nu,l} = o(\nu^{p+1})$, the theorem follows from the latter part of Hardy's
proof of his Theorem 3.⁴


4. Existence of a natural boundary.


THEOREM 4.1. Let $h(x)$ be defined as in Theorem 3.1. Take $k$ and $k'$ as any
two relatively prime positive integers and write $x_0 = e^{2\pi i k'/k}$. Let the points $x_0^{l}$
$(l = 1, 2, \dots, k - 1)$ be poles of order $r_l \le r - 1$ of $h(x)$. For $\lambda$ a positive
integer form the series

(4.1) $f(x) = \sum_{n=1}^{\infty} b_n x^{\lambda n} h(x^{n}),$

and let the coefficients $b_n$ be such that
(i) the series $\sum_{\nu=1}^{\infty} (k\nu)^{-r} b_{k\nu}$ is summable $(C, p)$ with sum $S$, and either
(ii) the series $\sum_{\nu=0}^{\infty} (k\nu + l)^{-r_l} b_{k\nu+l}$ $(l = 1, 2, \dots, k - 1)$ are summable $(C, p)$,
or


⁴ Hardy [3], pp. 196-197.










(ii')⁵ the p-th Cesàro sums $B^{p}_{\nu,l}$ formed from the series $\sum_{\nu=0}^{\infty} (k\nu + l)^{-r_l} b_{k\nu+l}$
$(l = 1, 2, \dots, k - 1)$ are such that $B^{p}_{\nu,l} = o(\nu^{p+1})$. Then, for radial approach of $x$
to $x_0$, we have the relation

$\lim_{x \to x_0} (1 - x/x_0)^{r} f(x) = SA,$

where $A$ is the limit of $(1 - x)^{r} h(x)$ as $x \to 1$. If this is true for an infinite number
of integers $k$, for each of which $S = \sum_{\nu=1}^{\infty} (k\nu)^{-r} b_{k\nu} \neq 0$, then $f(x)$ cannot be continued
analytically across the unit circle.


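Write $x = \rho e^{2\pi i k'/k}$. Then we must determine

(4.2) $\lim_{\rho \to 1} (1 - \rho)^{r} \sum_{n=1}^{\infty} b_n x^{\lambda n} h(x^{n}).$

First, consider those terms in the series in (4.2) for which $n$ is a multiple of
$k$: $n = k\nu$. Then $x^{k} = \rho^{k}$ and the contribution to the limit in (4.2) for such
values of $n$ becomes

$\lim_{\rho \to 1} (1 - \rho)^{r} \sum_{\nu=1}^{\infty} b_{k\nu} x^{\lambda k\nu} h(x^{k\nu}) = \lim_{\rho \to 1} \left(\frac{1 - \rho}{1 - \rho^{k}}\right)^{r} \sum_{\nu=1}^{\infty} b_{k\nu}\,\rho^{\lambda k\nu} (1 - \rho^{k})^{r} h(\rho^{k\nu})$

$= k^{-r} \lim_{\rho \to 1} \sum_{\nu=1}^{\infty} b_{k\nu}\,\rho^{\lambda k\nu} (1 - \rho^{k})^{r} h(\rho^{k\nu}).$

Let $y = \rho^{k}$ and write $a_\nu = (k\nu)^{-r} b_{k\nu}$, so that $\sum a_\nu$ is the series of hypothesis (i). This limit then becomes
$\lim_{y \to 1} \sum_{\nu=1}^{\infty} a_\nu \nu^{r} y^{\lambda\nu} (1 - y)^{r} h(y^{\nu}) = SA$ by
Theorem 3.1.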

To complete the proof of Theorem 4.1 we must show that the contribution
to the limit in (4.2) of those terms for which $n = k\nu + l$ $(l = 1, 2, 3, \dots, k - 1)$
is zero. We must consider

$\lim_{x \to x_0} (1 - x/x_0)^{r} \sum_{\nu=0}^{\infty} b_{k\nu+l}\, x^{\lambda(k\nu+l)} h(x^{k\nu+l})$

$= x_0^{\lambda l} \lim_{\rho \to 1} (1 - \rho)^{r - r_l - 1} (1 - \rho)^{1 + r_l} \sum_{\nu=0}^{\infty} b_{k\nu+l}\,\rho^{\lambda(k\nu+l)} h(x_0^{l}\rho^{k\nu+l}).$

This limit is zero as we see from Theorem 3.2 by putting $x_0^{l} = a$, $r_l = -\sigma$, noting
the hypothesis $r - r_l \ge 1$.
The statement concerning analytic continuation follows from the condition
that $SA \neq 0$ together with the fact that the points $x_0 = e^{2\pi i k'/k}$ form an everywhere
dense set on the circumference of the unit circle.
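
A numerical illustration of the radial limit may help fix ideas. The sketch below is not part of the proof and rests on stated assumptions: it takes the ordinary Lambert kernel $h(x) = 1/(1 - x)$ (so $r = 1$, $A = 1$), $\lambda = 1$, the rapidly decreasing coefficients $b_n = 2^{-n}$, the point $x_0 = e^{2\pi i/3}$ ($k = 3$, $k' = 1$), and a truncation at 200 terms. Under these choices $S = \sum_{\nu} (3\nu)^{-1} 2^{-3\nu} = \tfrac13 \log\tfrac87$.

```python
import cmath
import math

def lambert_partial_sum(x, b, lam=1):
    """Partial sum of sum_n b_n x^(lam*n) h(x^n) with h(x) = 1/(1 - x).

    b[0] holds b_1, b[1] holds b_2, and so on; the kernel h is an
    assumption of this illustration.
    """
    total = 0j
    for n, bn in enumerate(b, start=1):
        total += bn * x ** (lam * n) / (1 - x ** n)
    return total

k, k_prime = 3, 1
x0 = cmath.exp(2j * math.pi * k_prime / k)   # point on the unit circle
N = 200
b = [2.0 ** (-n) for n in range(1, N + 1)]   # illustrative coefficients

rho = 1 - 1e-6                               # radial approach x = rho * x0
x = rho * x0
lhs = (1 - x / x0) * lambert_partial_sum(x, b)   # (1 - x/x0)^r f(x), r = 1

# S = sum over multiples of k of (k*nu)^(-r) * b_{k*nu}; A = 1 for this h
S = sum(b[k * v - 1] / (k * v) for v in range(1, N // k + 1))
print(lhs, S)
```

Only the terms with $n$ a multiple of $k$ survive in the limit, which is what the two halves of the proof above establish.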


⁵ That condition (ii') is slightly more general than condition (ii) is proved by Hardy
and Littlewood [4], p. 435, Theorem 14.







REFERENCES 


1. T. J. I'A. Bromwich, Mathematische Annalen, vol. 65(1908), pp. 350-369.
2. M. C. Garvin, American Journal of Mathematics, vol. 58(1936), pp. 507-513.
3. G. H. Hardy, Proceedings of the London Mathematical Society, (2), vol. 13(1913), pp. 192-198.
4. G. H. Hardy and J. E. Littlewood, Proceedings of the London Mathematical Society, (2), vol. 11(1912), pp. 411-478.
5. Konrad Knopp, Journal für die reine und angewandte Mathematik, vol. 142(1913), pp. 283-315.
6. J. H. Lambert, Anlage zur Architectonik, oder Theorie des Einfachen und Ersten in der philosophischen und mathematischen Erkenntnis, vol. II, 1771, p. 507 (§875).


UNIVERSITY OF ILLINOIS. 
















PERSYMMETRIC AND JACOBI DETERMINANT EXPRESSIONS FOR 
ORTHOGONAL POLYNOMIALS 


By Vivian EBERLE SPENCER 


Introduction. Work with orthogonal Tchebycheff polynomials (OP) has 
usually taken as its point of attack the notion of a weight function and the 
corresponding moment problem. In the study of OP two important expressions 
for them as determinants arise. The first, or persymmetric determinant ex- 
pression, is obtained by replacing the elements of the last row of a certain 
positive persymmetric determinant by powers of x; the second, or Jacobi deter- 
minant expression, results when the characteristic determinant of a certain 
Jacobi matrix is written. The present paper undertakes a study of orthogonal 
and related persymmetric polynomials from the standpoint of the theory of 
matrices and determinants. In all fundamental theory the notion of a weight 
function is entirely avoided. 

The persymmetric determinant expression leads to a classification of sequences
of these polynomials into sets S, each of which is found to contain one and only
one symmetric sequence. Properties of sets S are investigated. In considering
the Jacobi determinant expression we come upon certain finite sequences¹
$\{\Theta_i(x)\}_0^n$ of orthogonal polynomials which associate themselves in a very simple
manner with any sequence of OP $\{\Phi_n(x)\}$. The study of $\{\Theta_i(x)\}_0^n$ leads to new
bounds for the zeros of $\Phi_n(x)$. Properties of the continued fraction associated
with $\{\Theta_i(x)\}_0^n$ are obtained. Combining these results, we are led to a set of
theorems regarding a formally defined interval of orthogonality for $\{\Phi_n(x)\}$.
A theorem of Krein² has important bearing on our study. We close with its
extension. Throughout the paper applications are made to the classical orthogonal
polynomials.

For OP we use the notation adopted by J. Shohat.³


1. A classification of sequences of persymmetric polynomials. A persymmetric
or Hankel determinant is a determinant in which each line perpendicular to the 


Received December 20, 1938; section 1 of this paper was presented to the American 
Mathematical Society, September 13, 1935. The author wishes to express her gratitude 
to Professor J. Shohat for his highly-valued interest and suggestions throughout the prepa- 
ration of this paper. 

¹ By the notation $\{q_i\}_1^n$ will be understood the sequence $\{q_i\}$ $(i = 1, 2, \dots, n)$.

² M. Krein, Über das Spektrum der Jacobischen Form in Verbindung mit der Theorie der
Torsionsschwingungen von Walzen (in Russian), Rec. Math. Moscou, vol. 40(1933), pp.
455-465.

³ J. Shohat, Théorie Générale des Polynomes Orthogonaux de Tchebichef, Mémorial des
Sciences Math., Fasc. 66(1934).




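principal diagonal has all its elements alike. Thus,

(1) $\Delta_n = \begin{vmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{n-1} \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_{n-1} & \alpha_n & \cdots & \alpha_{2n-2} \end{vmatrix} = |\alpha_{i+j}|_0^{n-1}$ $(\Delta_0 = 1,\ \Delta_1 = \alpha_0)$

is a persymmetric determinant of order $n$. $\Delta_n$ is determined when the $2n - 1$
elements of the principal and one adjacent minor diagonal are given. We shall
assume only that $\Delta_n \neq 0$ $(n = 1, 2, \dots)$, a broader assumption than $\Delta_n > 0$,
which leads to OP.⁴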

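Consider the sequence of polynomials $\{\Phi_n(x)\}$, where

(2) $\Phi_n(x) = \frac{1}{\Delta_n} \begin{vmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_n \\ \alpha_1 & \alpha_2 & \cdots & \alpha_{n+1} \\ \vdots & \vdots & & \vdots \\ \alpha_{n-1} & \alpha_n & \cdots & \alpha_{2n-1} \\ 1 & x & \cdots & x^{n} \end{vmatrix}$ $(n = 1, 2, \dots;\ \Phi_0 = 1)$.

By elementary transformations (2) may be reduced to the form

(3) $\Phi_n(x) = \frac{(-1)^{n}}{\Delta_n} \begin{vmatrix} \alpha_1 - x\alpha_0 & \alpha_2 - x\alpha_1 & \cdots & \alpha_n - x\alpha_{n-1} \\ \alpha_2 - x\alpha_1 & \alpha_3 - x\alpha_2 & \cdots & \alpha_{n+1} - x\alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_n - x\alpha_{n-1} & \alpha_{n+1} - x\alpha_n & \cdots & \alpha_{2n-1} - x\alpha_{2n-2} \end{vmatrix},$

a multiple of a persymmetric determinant. Hence, we shall call the $\Phi_n(x)$
persymmetric polynomials.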
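
The determinant expression (2) can be evaluated directly for small $n$. The following is a minimal sketch, assuming the bordered-Hankel form of (2) (rows of moments, last row $1, x, \dots, x^n$, divided by $\Delta_n$); the names `det` and `persymmetric_poly` are illustrative, and the moments used as an example are those of the weight 1 on $(-1, 1)$, i.e. $2/(k+1)$ for even $k$ and 0 for odd $k$.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (small orders only)."""
    if not m:
        return Fraction(1)  # empty determinant, matching Delta_0 = 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def persymmetric_poly(moments, n):
    """Coefficients of Phi_n(x), lowest degree first, from moments[0..2n-1]."""
    delta_n = det([[moments[i + j] for j in range(n)] for i in range(n)])
    if delta_n == 0:
        raise ValueError("Delta_n must be nonzero")
    # the n moment rows of (2); the last row (1, x, ..., x^n) is handled
    # by cofactor expansion, giving one coefficient per column
    rows = [[moments[i + j] for j in range(n + 1)] for i in range(n)]
    coeffs = []
    for j in range(n + 1):
        minor = [row[:j] + row[j + 1:] for row in rows]
        coeffs.append((-1) ** (n + j) * det(minor) / delta_n)
    return coeffs

# moments of the weight 1 on (-1, 1): 2/(k+1) for even k, 0 for odd k
legendre_moments = [Fraction(2, k + 1) if k % 2 == 0 else Fraction(0) for k in range(8)]
phi2 = persymmetric_poly(legendre_moments, 2)   # x^2 - 1/3
phi3 = persymmetric_poly(legendre_moments, 3)   # x^3 - (3/5) x
print(phi2, phi3)
```

The coefficient of $x^n$ is always 1, since its cofactor is $\Delta_n$ itself; the Laplace expansion is chosen for brevity, not efficiency.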

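In substance Jacobi⁵ first proved that the polynomials of $\{\Phi_n(x)\}$ satisfy the
recurrence relation

(4) $\Phi_n(x) = \left(x - \frac{\Delta_n'}{\Delta_n} + \frac{\Delta_{n-1}'}{\Delta_{n-1}}\right)\Phi_{n-1}(x) - \frac{\Delta_{n-2}\Delta_n}{\Delta_{n-1}^{2}}\,\Phi_{n-2}(x)$ $(n = 2, 3, \dots)$,

where

(5) $\Delta_n' = \begin{vmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{n-2} & \alpha_n \\ \alpha_1 & \alpha_2 & \cdots & \alpha_{n-1} & \alpha_{n+1} \\ \vdots & \vdots & & \vdots & \vdots \\ \alpha_{n-2} & \alpha_{n-1} & \cdots & \alpha_{2n-4} & \alpha_{2n-2} \\ \alpha_{n-1} & \alpha_n & \cdots & \alpha_{2n-3} & \alpha_{2n-1} \end{vmatrix}$ $(n = 2, 3, \dots;\ \Delta_0' = 0,\ \Delta_1' = \alpha_1)$.


⁴ Shohat has shown that polynomials (2) with the broader condition $\Delta_n \neq 0$ lead to a
generalization of orthogonal polynomials (Comptes Rendus, vol. 207(1938), pp. 556-558).

⁵ C. G. J. Jacobi, De eliminatione variabilis e duabus aequationibus algebraicis, Journal
für Mathematik, vol. 15(1836), pp. 101-124.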







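A direct derivation of (4) from (2) may be carried through, for example, by the
method of Muir and Metzler.⁶

Write

(6) $c_n = \frac{\Delta_n'}{\Delta_n} - \frac{\Delta_{n-1}'}{\Delta_{n-1}}$ $(n = 1, 2, \dots)$, $\qquad \lambda_n = \frac{\Delta_{n-2}\Delta_n}{\Delta_{n-1}^{2}}$ $(n = 2, 3, \dots;\ \lambda_1 = \alpha_0 = \Delta_1)$.

Then (4) may be written as

(7) $\Phi_n(x) = (x - c_n)\Phi_{n-1}(x) - \lambda_n\Phi_{n-2}(x)$
$(n = 2, 3, \dots;\ \Phi_0(x) = 1,\ \Phi_1(x) = x - c_1)$.

For all $\lambda_n > 0$, (7) is known to be characteristic for OP.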
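
The recurrence (7) gives a quick way to generate the polynomials once the constants are known. The sketch below assumes (7) in the form $\Phi_n = (x - c_n)\Phi_{n-1} - \lambda_n\Phi_{n-2}$ with $\Phi_0 = 1$, $\Phi_1 = x - c_1$; the function name `poly_from_recurrence` is hypothetical. As test data it uses the Legendre constants $c_n = 0$, $\lambda_n = (n-1)^2/((2n-1)(2n-3))$ as listed in table (35).

```python
from fractions import Fraction

def poly_from_recurrence(c, lam, n):
    """Coefficients of Phi_n (lowest degree first) from the three-term recurrence.

    c(k) and lam(k) are callables returning c_k (k >= 1) and lambda_k (k >= 2).
    """
    prev = [Fraction(1)]                        # Phi_0
    if n == 0:
        return prev
    curr = [-Fraction(c(1)), Fraction(1)]       # Phi_1 = x - c_1
    for k in range(2, n + 1):
        nxt = [Fraction(0)] + curr              # x * Phi_{k-1}
        for i, v in enumerate(curr):
            nxt[i] -= c(k) * v                  # - c_k * Phi_{k-1}
        for i, v in enumerate(prev):
            nxt[i] -= lam(k) * v                # - lambda_k * Phi_{k-2}
        prev, curr = curr, nxt
    return curr

def legendre_c(k):
    return Fraction(0)

def legendre_lam(k):
    return Fraction((k - 1) ** 2, (2 * k - 1) * (2 * k - 3))

phi3 = poly_from_recurrence(legendre_c, legendre_lam, 3)   # x^3 - (3/5) x
phi4 = poly_from_recurrence(legendre_c, legendre_lam, 4)   # x^4 - (6/7) x^2 + 3/35
print(phi3, phi4)
```

Passing the constants as callables keeps the routine independent of any particular family of polynomials.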

Given any sequence of numbers $\{\alpha_n\}$, an associated sequence of determinants
$\{\Delta_n\}$ and an associated sequence of polynomials $\{\Phi_n(x)\}$ are determined. Denote
the sequence of numerical values of the determinants $\Delta_n$ by $\{\delta_n\}$. Let us
investigate the properties of sets of sequences $\{\Phi_n(x)\}$ with each of which is
associated the same sequence of numbers $\{\delta_n\}$ by allowing the $\alpha_n$ to vary in such
a way that $\{\delta_n\}$ remains fixed.

DEFINITION. A set S is a set of sequences $\{\Phi_n(x)\}$ with each of which is
associated the same sequence of numbers $\{\delta_n\}$.

The $\alpha_n$ will be called moments associated with $\{\Phi_n(x)\}$, and $\{\alpha_n\}$ will be said
to generate $\{\Phi_n(x)\}$.

Consider any two such sequences $\{{}_i\Phi_n(x)\}$ and $\{{}_j\Phi_n(x)\}$ of the same set S
to which correspond the sequences of determinants $\{{}_i\Delta_n\}$ and $\{{}_j\Delta_n\}$. $\{{}_j\Delta_n\}$ is
obtainable from $\{{}_i\Delta_n\}$ by multiplying the matrices of $\{{}_i\Delta_n\}$ by matrices whose
elements may or may not depend on $\{{}_i\alpha_n\}$. Obviously, the numerical value
of the product of these matrix multipliers must be one. We may write the
product of all matrix multipliers on the left and right of ${}_i\Delta_n$ respectively as
$q\,{}_1T_n$ and $q^{-1}T_n$, where $q$ is a constant ($\neq 0$) and ${}_1T_n$ and $T_n$ are unit matrices.
Moreover, we may assume ${}_1T_n$ and $T_n$ reduced to the form

(8) $\begin{Vmatrix} 1 & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & 1 & a_{23} & \cdots & a_{2n} \\ 0 & 0 & 1 & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{Vmatrix}$

For the first row and first column of ${}_1T_n\,{}_i\Delta_n\,T_n$ to be identical, it is necessary
that ${}_1T_n = T_n'$, the transpose of $T_n$. Hence, the elements of the sequence of

⁶ Muir and Metzler, Theory of Determinants, 1930, p. 433.













determinants associated with any two sequences $\{{}_i\Phi_n(x)\}$ and $\{{}_j\Phi_n(x)\}$ of the
same set S satisfy the relation

(9) ${}_j\Delta_n = T_n'\,{}_i\Delta_n\,T_n$ ($T_n$ given by (8)).

$T_n$ shall be said to carry $\{{}_i\Phi_n(x)\}$ into $\{{}_j\Phi_n(x)\}$.
Consider first the case where the $a_{ij}$ in (8) do not depend on $\{\alpha_n\}$.


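THEOREM I. Let $\{{}_i\Phi_n(x)\}$ and $\{{}_j\Phi_n(x)\}$ be two sequences of a given set S. If
the sequence of matrices $\{T_n\}$, independent of $\{\alpha_n\}$, carries $\{{}_i\Phi_n(x)\}$ into $\{{}_j\Phi_n(x)\}$,
then

(10) $T_n = \begin{Vmatrix} 1 & m & m^{2} & \cdots & m^{n-1} \\ 0 & 1 & 2m & \cdots & (n-1)m^{n-2} \\ 0 & 0 & 1 & \cdots & \binom{n-1}{2}m^{n-3} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{Vmatrix}$ ($m$ arbitrary; $n = 1, 2, \dots$).

Moreover, if $\{{}_i\Phi_n(x)\}$ and $\{{}_j\Phi_n(x)\}$ are so related, then, if the sequence of moments
associated with $\{{}_i\Phi_n(x)\}$ is $\{\alpha_n\}$, the sequence of moments associated with $\{{}_j\Phi_n(x)\}$ is

$\left\{m^{n}\alpha_0 + n m^{n-1}\alpha_1 + \frac{n(n-1)}{2!} m^{n-2}\alpha_2 + \cdots + \alpha_n\right\}$ $(n = 0, 1, \dots)$.


Proof. Sufficiency. If the right member product in (9) be formed with
(10), the resulting left member is the determinant⁷

$\left| m^{i+j}\alpha_0 + (i + j)m^{i+j-1}\alpha_1 + \frac{(i+j)(i+j-1)}{2!} m^{i+j-2}\alpha_2 + \cdots + \alpha_{i+j} \right|_0^{n-1}.$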


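Necessity. Assuming the $a_{ij}$ of (8) independent of $\{\alpha_n\}$, form $T_n'\Delta_nT_n = \|A_{ij}\|$. This is found to be

(11) $\begin{Vmatrix} \alpha_0 & a_{12}\alpha_0 + \alpha_1 & \cdots \\ a_{12}\alpha_0 + \alpha_1 & a_{12}^{2}\alpha_0 + 2a_{12}\alpha_1 + \alpha_2 & \cdots \\ a_{13}\alpha_0 + a_{23}\alpha_1 + \alpha_2 & a_{12}a_{13}\alpha_0 + (a_{12}a_{23} + a_{13})\alpha_1 + (a_{12} + a_{23})\alpha_2 + \alpha_3 & \cdots \\ \vdots & \vdots & \end{Vmatrix}$

where, since $\Delta_n = \|\alpha_{i+j-2}\|$, $T_n = \|a_{ij}\|$, $T_n' = \|a_{ji}\|$,

(12) $A_{ij} = \sum_{k} a_{ki} \sum_{l} \alpha_{k+l-2}\,a_{lj} = \sum_{k}\sum_{l} a_{ki}a_{lj}\alpha_{k+l-2};$

but if (11) is persymmetric, $A_{ij} = A_{i+1,j-1}$, or

(13) $\sum_{k}\sum_{l} a_{ki}a_{lj}\alpha_{k+l-2} = \sum_{k}\sum_{l} a_{k,i+1}a_{l,j-1}\alpha_{k+l-2}.$

Now compare coefficients, remembering that $a_{ii} = 1$ and $a_{ij} = 0$ if $i > j$.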


⁷ T. Muir, On a property of persymmetric determinants, Messenger of Math., vol. 11(1881),
pp. 65-67.







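THEOREM II. $\{T_n\}$ defined by (10) carries any $\{\Phi_n(x)\}$ into $\{\Phi_n(x - m)\}$.


Proof. By elementary transformations it is readily shown that $\Delta_n\Phi_n(x - m)$
is given by

(14) $\begin{vmatrix} \alpha_0 & m\alpha_0 + \alpha_1 & m^{2}\alpha_0 + 2m\alpha_1 + \alpha_2 & \cdots \\ m\alpha_0 + \alpha_1 & m^{2}\alpha_0 + 2m\alpha_1 + \alpha_2 & \cdots & \\ m^{2}\alpha_0 + 2m\alpha_1 + \alpha_2 & \cdots & & \\ \vdots & & & \\ 1 & x & x^{2} & \cdots \quad x^{n} \end{vmatrix}$

COROLLARY. If a set S contains a certain sequence $\{\Phi_n(x)\}$, it also contains
any sequence obtainable from $\{\Phi_n(x)\}$ by an arbitrary displacement along the x-axis.
Moreover, all sequences associated by means of a matrix of type (10) are so related.⁸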
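
The invariance of the determinants under a displacement can be checked on a small example. The sketch below assumes that the displaced moments are $\sum_{j} \binom{n}{j} m^{n-j}\alpha_j$, as in the statement of Theorem I; the helper names and the choice of test moments ($2/(k+1)$ for even $k$, 0 for odd $k$) and of the shift $m = 3/2$ are illustrative.

```python
from fractions import Fraction
from math import comb

def det(m):
    """Determinant by Laplace expansion along the first row (small orders only)."""
    if not m:
        return Fraction(1)
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def hankel_dets(moments, count):
    """Persymmetric determinants Delta_1..Delta_count built from the moments."""
    return [
        det([[moments[i + j] for j in range(n)] for i in range(n)])
        for n in range(1, count + 1)
    ]

def shifted_moments(moments, m):
    """Moments after a displacement m: sum_j C(n, j) m^(n-j) alpha_j."""
    return [
        sum(comb(n, j) * m ** (n - j) * moments[j] for j in range(n + 1))
        for n in range(len(moments))
    ]

moments = [Fraction(2, k + 1) if k % 2 == 0 else Fraction(0) for k in range(7)]
shift = Fraction(3, 2)
before = hankel_dets(moments, 4)
after = hankel_dets(shifted_moments(moments, shift), 4)
print(before, after)
```

Both lists agree term by term, which is the statement that the original and displaced sequences belong to the same set S.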


Let us now admit matrices (8) dependent upon $\{\alpha_n\}$.


THEOREM III. Any given set S always contains a sequence $\{{}_i\Phi_n(x)\}$ generated
by a sequence of moments $\{{}_i\alpha_n\}$ whose odd elements have arbitrarily preassigned
values. Moreover, if the odd elements are given, the even ones are uniquely determined.


Proof. Write the product

(15) $T_n'\Delta_nT_n = \|A_{ij}\| = {}_i\Delta_n,$

where $A_{ij}$ is the linear function of $\{\alpha_k\}_0^{i+j-2}$ indicated in (13). We shall need

LEMMA 1. The determinant of coefficients of the system

(16) $A_{1n} = A_{2,n-1},\quad A_{2n} = A_{3,n-1},\quad \dots,\quad A_{n-2,n} = A_{n-1,n-1},\quad A_{n-1,n} = K_{n-1}$

(where $K_{n-1}$ is a constant), considered as a function of $a_{1n}, a_{2n}, a_{3n}, \dots, a_{n-1,n}$, is
$\Delta_{n-1}$.


Proof. Let $L = L(t_1, t_2, \dots, t_n)$ denote a linear function of $t_1, t_2, \dots, t_n$
of the form $\sum_{1}^{n-1} b_it_i + t_n$. Since the $A_{i,n-1}$ are independent of $a_{1n}, a_{2n}, \dots,$


⁸ If all $\delta_n > 0$, then the moment problem $\{\alpha_n\}_0^{\infty}$ has at least one solution, i.e., there
exists a function $\psi(x)$, bounded and non-decreasing in $(-\infty, \infty)$, such that $\int_{-\infty}^{\infty} x^{n}\,d\psi(x) = \alpha_n$
$(n = 0, 1, \dots)$. Here the new moments $\{m^{n}\alpha_0 + nm^{n-1}\alpha_1 + \cdots + \alpha_n\}$ are evidently
given by $\int_{-\infty}^{\infty} (x + m)^{n}\,d\psi(x) = \int_{-\infty}^{\infty} x^{n}\,d\psi(x - m)$. Hence, in this case transforming $\Delta_n$ by
means of (10) corresponds to a displacement of $\{\Phi_n(x)\}$ along the x-axis, and the invariance
of $\{\delta_n\}$ is evident.













$a_{n-1,n}$, write $A_{i,n-1} = C_i$. Then, by (11), letting $C_i'$ denote the sum of the
terms of $A_{in}$ independent of $a_{1n}, a_{2n}, \dots, a_{n-1,n}$, we may write (16) as

(17)
$L_1(\alpha_0)a_{1n} + L_1(\alpha_1)a_{2n} + \cdots + L_1(\alpha_{n-2})a_{n-1,n} = C_2 - C_1',$
$L_2(\alpha_0, \alpha_1)a_{1n} + L_2(\alpha_1, \alpha_2)a_{2n} + \cdots + L_2(\alpha_{n-2}, \alpha_{n-1})a_{n-1,n} = C_3 - C_2',$
$\cdots$
$L_{n-1}(\alpha_0, \alpha_1, \dots, \alpha_{n-2})a_{1n} + L_{n-1}(\alpha_1, \alpha_2, \dots, \alpha_{n-1})a_{2n} + \cdots + L_{n-1}(\alpha_{n-2}, \alpha_{n-1}, \dots, \alpha_{2n-4})a_{n-1,n} = K_{n-1} - C_{n-1}',$

where $L_i$ denotes a particular linear function of type $L$. The determinant of
the coefficients of the linear system (17) reduces by elementary transformations
to $\Delta_{n-1}$.

Returning now to the proof of Theorem III, assume the $a_{ij}$ of (8) are dependent
on $\{\alpha_i\}$. Let the arbitrary preassigned sequence of values for the odd
elements of $\{{}_i\alpha_n\}$ be denoted by $\{K_i\}$ $(i = 1, 2, \dots)$. First, consider (15)
for $n = 2$. From (11) we see that $T_2'\Delta_2T_2$ is a function of $a_{12}$ alone. Our conditions
give $A_{12} = A_{21} = K_1$, a linear equation in $a_{12}$ with the coefficient of $a_{12}$
different from zero. Moreover, if we solve the latter, $A_{22}$ is uniquely determined.
Hence, the theorem is true for ${}_i\alpha_1$ and ${}_i\alpha_2$. Assume it true for $\{{}_i\alpha_i\}_0^{2n-4}$, that
is, for every ${}_i\alpha_i$ in ${}_i\Delta_{n-1}$. Then we may consider ${}_i\Delta_n$ as a function of the $n - 1$
unknowns $a_{1n}, a_{2n}, a_{3n}, \dots, a_{n-1,n}$, these to be determined by the conditions
that ${}_i\Delta_n$ be persymmetric and that ${}_i\alpha_{2n-3} = K_{n-1}$. From (11), ${}_i\Delta_n$ is symmetric;
hence, our conditions lead to the $n - 1$ linear equations (16) for these unknowns.
Since we have made the fundamental assumption that no $\delta_i$ be zero,
as an immediate consequence of Lemma 1, a unique solution of (16) for $a_{1n},
a_{2n}, a_{3n}, \dots, a_{n-1,n}$ exists. Moreover, these $a_{ij}$ determine ${}_i\Delta_n$ uniquely.
The theorem now follows by induction.

DEFINITION. A sequence of persymmetric polynomials $\{\Phi_n(x)\}$ is said to be
a symmetric sequence if

(18) $\Phi_n(-x) = (-1)^{n}\Phi_n(x)$ $(n = 1, 2, \dots)$,

that is, if

(19) $\Phi_n(x) = x^{n} + h_{n-2}x^{n-2} + h_{n-4}x^{n-4} + \cdots + h_1x$, or $h_0$.


LEMMA 2. A necessary and sufficient condition that $\{\Phi_n(x)\}$ be a symmetric
sequence is that all the odd moments $\{\alpha_i\}$ $(i = 1, 3, 5, \dots)$ be zero.


Proof. Sufficiency. By inspection of (2), under the assumption that $\alpha_i = 0$
$(i = 1, 3, \dots)$, we see that the coefficient of $x^{n-k}$ $(k = 1, 3, \dots)$ is a determinant
such that the column in which any non-zero element lies contains either one more
or one less non-zero element than the row in which it lies. By suitable interchanges
of rows and columns and expansion by Laplace's method, this determinant
is seen to vanish.




Necessity. We have

(20) $\Phi_1(x) = \frac{1}{\alpha_0}\begin{vmatrix} \alpha_0 & \alpha_1 \\ 1 & x \end{vmatrix}.$

Hence, if $\Phi_1(x)$ is symmetric, then $\alpha_1 = 0$. The proof may be completed by
induction.


Combining this lemma with Theorem III, we have

COROLLARY 1. A set S contains one and only one symmetric sequence.


This sequence will be denoted by $\{{}_s\Phi_n(x)\}$ and the associated moments by
$\{{}_s\alpha_n\}$.


COROLLARY 2. Any given set S contains one and only one sequence $\{\Phi_n(x)\}$
for which $\{c_n\}$, the sequence of constants defined in (6), takes an arbitrarily preassigned
set of values.


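Proof. From (6),

(21) $c_i = \frac{\Delta_i'}{\Delta_i} - \frac{\Delta_{i-1}'}{\Delta_{i-1}} = \frac{\alpha_{2i-1}\Delta_{i-1}}{\Delta_i} + \frac{1}{\Delta_i}\begin{vmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{i-2} & \alpha_i \\ \alpha_1 & \alpha_2 & \cdots & \alpha_{i-1} & \alpha_{i+1} \\ \vdots & \vdots & & \vdots & \vdots \\ \alpha_{i-1} & \alpha_i & \cdots & \alpha_{2i-3} & 0 \end{vmatrix} - \frac{\Delta_{i-1}'}{\Delta_{i-1}}.$

Here $\Delta_{i-1}, \Delta_i \neq 0$, and $\alpha_{2i-1}\Delta_{i-1}/\Delta_i$ is the only term containing $\alpha_{2i-1}$; hence,
$c_i$ may be made to take any arbitrarily preassigned value by a suitable choice
of $\alpha_{2i-1}$, and, by Theorem III, the latter may be chosen arbitrarily. Moreover,
by the same theorem the odd elements of $\{\alpha_n\}$ determine its even elements,
hence the $\{c_i\}$, uniquely in S.

Remark. Corollary 1 may be obtained as an immediate consequence of
Corollary 2, since the definition of a symmetric sequence, combined with the
recurrence relation (7), leads at once to the conclusion that for such a sequence
$c_n = 0$ $(n = 1, 2, \dots)$.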

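Explicit expression for $\{{}_s\alpha_n\}$. For brevity write ${}_s\alpha_i = \alpha_i$ $(i = 0, 1, \dots)$,
then $\alpha_{2i+1} = 0$. By a suitable interchange of rows and columns and Laplace
expansion

(22) $\Delta_n = \begin{vmatrix} \alpha_0 & \alpha_2 & \cdots & \alpha_l \\ \alpha_2 & \alpha_4 & \cdots & \alpha_{l+2} \\ \vdots & \vdots & & \vdots \\ \alpha_l & \alpha_{l+2} & \cdots & \alpha_{2l} \end{vmatrix} \cdot \begin{vmatrix} \alpha_2 & \alpha_4 & \cdots & \alpha_m \\ \alpha_4 & \alpha_6 & \cdots & \alpha_{m+2} \\ \vdots & \vdots & & \vdots \\ \alpha_m & \alpha_{m+2} & \cdots & \alpha_{2m-2} \end{vmatrix},$

where $l$, $m$ are respectively double the largest integer in $\tfrac12(n - 1)$ and $\tfrac12 n$. Successive
application of (22) enables us to write $\alpha_n$ in terms of the $\{\delta_n\}$ determining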












¥ n 1 rie 
S, and the fa;}; Thus, 
2 
be b3 , ae 
a=—-, a= t+-, , 
a bo a 
‘ ar Ar+2 25 Am ar Ar+2 ho Am—2 
(23) D 
Qen—2 6, Ar +2 Ar+4 mes Am+2 = Ar+2 Or+4 te Am 
“s ’ 
bn —1 | 
Am Qm+2 *** 0 Am—2 Am ey Cons | 


where $r = 0$ for $n$ odd and 2 for $n$ even; and $m$ is double the largest integer in $\tfrac12 n$.

Let us next investigate whether the even elements of $\{\alpha_n\}$ can be chosen
arbitrarily.

THEOREM IV. Associated with each set S there exists a set of intervals $\{\mathfrak{S}_i\}$
such that the set S contains a sequence $\{{}_i\Phi_n(x)\}$ generated by a sequence of moments
$\{{}_i\alpha_n\}$ whose even elements $\{{}_i\alpha_i\}$ $(i = 2, 4, \dots)$ have respectively values arbitrarily
preassigned within the intervals $\{\mathfrak{S}_i\}$. Moreover, if the even moments are given,
the odd moments are determined as two-valued functions of these even moments.

Proof. The proof is analogous to that of Theorem III. Consider the arbitrary
sequence of even moments $\{K_i\}$ $(i = 1, 2, \dots)$, and the product $T_n'\Delta_nT_n$
written in the form (15). For $n = 2$ the conditions of our theorem require
$A_{22} = K_1$, a quadratic equation in $a_{12}$. $K_1$ is restricted to values which will
make the solution of this equation real. Hence,

$K_1 \ge \frac{\alpha_0\alpha_2 - \alpha_1^{2}}{\alpha_0} = \frac{\delta_2}{\alpha_0},$

and $\mathfrak{S}_1 = (\delta_2/\alpha_0, \infty)$. Moreover, given $K_1$, $a_{12}$ is determined as the solution
of $A_{22} = K_1$, hence, as a two-valued function of $K_1$, and ${}_i\alpha_1$ $(= a_{12}\alpha_0 + \alpha_1)$ is
a linear function of $a_{12}$. We complete the proof by induction. Assume the
theorem to hold for the moments $\{{}_i\alpha_i\}_0^{2n-4}$, that is, for every moment in ${}_i\Delta_{n-1}$.
Denote these moments by $\{B_i\}_0^{2n-4}$. Then we may consider ${}_i\Delta_n$ as a function of
the $n - 1$ unknowns $a_{1n}, a_{2n}, \dots, a_{n-1,n}$. To determine these, the conditions
of the theorem give

G0Ain + Aden + +--+ + On—2An-tyn + On-1 = Bi, 
Lao , 01) din + Lela , a2)don + --- 


+ L2(an—2 ’ Qn—1)An-1,n + L2(an—1 ’ Qn) = B, ’ 


(24) Ln-2(ao » U1, *** » Ay -3)@in + Ln-2(ar 9 U2, °**y Gn—2) Aon +:-- 
+ La~2(Gn > G@n-2 > °** 5 Q2n—5)An—1,n + Lin-2(@n i» Gay °** 5 Gon) ™ Bon 


Ln(ao , a1, +++ , @n-1)@in + Lp(a, a2, +++ , On)don + 


+ Linlan GS» Ga-8 5 *** » Gon—3) An ~l,n + L,. (an Bp Gay *** 5 Qten—2) = K,-1: 




The first n — 2 of these equations may be written as 


L,(ao 9 Gap °** 5 Q@n—1) = Ba-1 ’ 
(25) Lnlay, a2, +--+ , an) = B, — apB,-1, 
L,.(an- 3, Gn-2,°°* » Aen 4) = Ben 4 — Ain 2B,, ——s 


a Go-3.0-0h Bes Bega -). 


Denote the constants on the right of (25) by $\{D_i\}_1^{n-2}$, and combine with the
last equation of (24). We obtain


D, 141» + Dy, dan + oh tn -+- Dey ~44n_2,n 
+ (an 2Q1n + Qn—1 en + violas + Gen—44n—-1,n + Qon—3) An -1,n 
+ (an -1 in + An Aen + e? + Q2n—3 An—-1,n + G2n—2) - K, -l- 


Substituting in this the solution of (25) for $a_{1n}, a_{2n}, \dots, a_{n-2,n}$ (this solution
necessarily exists, since the determinant of the coefficients is $\Delta_{n-2} \neq 0$) in terms
of $a_{n-1,n}$, and simplifying, we obtain


2 ” 
An—14n—-1, + 2An-1 An-1,n 


ao heli An-3 An-1 ao fie An-3 Di 
) _— | 
(26 + la, 3 *** Qen6 Gen—4 Qn *** Qan6 Deans 
Qn-1 x" Gen—4 AIn-2 Dinws mae Dens 0 
— je oa = 0. 


The condition that the discriminant of this last equation be non-negative
gives, when the determinants are combined by means of the Studnička expansion,⁹


ao stv An-3 Dy 
f 4 bn ] ; — . 
(27) Reus = a = aF Qn-3 . om Qon-6 Dens = E, Sn = (E, oo), 
Dri -:: Daw 29 


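By (26), $a_{n-1,n}$ is determined as a two-valued function of $K_{n-1}$, and thus we
have

${}_i\alpha_{2n-3} = L_{n-1}(\alpha_0, \alpha_1, \dots, \alpha_{n-2})a_{1n} + \cdots + L_{n-1}(\alpha_{n-2}, \alpha_{n-1}, \dots, \alpha_{2n-4})a_{n-1,n} + L_{n-1}(\alpha_{n-1}, \alpha_n, \dots, \alpha_{2n-3})$

is a linear function of $a_{1n}, a_{2n}, \dots, a_{n-1,n}$.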


⁹ The form of the Studnička expansion here used is shown explicitly in (28') below.
See, for example, E. Pascal, Die Determinanten, 1900, pp. 39-40.













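In order to derive an important corollary of this theorem, we need


LEMMA 3. If $\Delta_i > 0$ $(i = 1, 2, \dots)$, then for $\{D_i\}$ arbitrary

(28) $\begin{vmatrix} \alpha_0 & \cdots & \alpha_{n-3} & D_1 \\ \vdots & & \vdots & \vdots \\ \alpha_{n-3} & \cdots & \alpha_{2n-6} & D_{n-2} \\ D_1 & \cdots & D_{n-2} & 0 \end{vmatrix} \le 0$ $(n = 3, 4, \dots)$.

Proof. The lemma obviously holds for $n = 3$. Assume it holds for $n = k - 1$.
By the Studnička expansion, we have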


ag th An-3 Dy 
Anos An-3 — A2n—6 Dens 
. Dy-1 = Den 0 
(28’) , . 
ao =e An-4 Diya ao — An-4 Dra 
= _ < 
3. “2 An—4 aay G2n—8 Dons an cx ey A2n-8 Den—s = 0, 
Di ia Dens 0 A@n—5 °° * @2n-7 Dons 


and the lemma follows by induction.
Hence, as a corollary of Theorem IV, we have


COROLLARY. If $\Delta_i > 0$ $(i = 1, 2, \dots)$, then

(29) $\alpha_{2n} \ge \frac{\Delta_{n+1}}{\Delta_n}$ $(n = 1, 2, \dots)$.


We are thus led naturally to study the class of all persymmetric polynomials
belonging to sets S satisfying the condition $\Delta_i > 0$. Hamburger¹⁰ has shown
that $\Delta_i > 0$, for every $i$, is a necessary and sufficient condition that the moment
problem in the interval $(-\infty, \infty)$ have at least one solution with infinitely
many points of increase. In this case, as noted above (footnote 8), the moments
are defined as $\alpha_i = \int_{-\infty}^{\infty} x^{i}\,d\psi(x)$, and the associated polynomials are OP. The
corollary has thus given, without the introduction of the notion of a weight
function, or of an interval of orthogonality, or of mechanical quadratures, that
for OP the even moments are positive, and in addition has given us a lower
bound for these moments. A further application to OP follows.

The normalizing factors corresponding to any sequence $\{\Phi_n(x)\}$ of OP are¹¹


¹⁰ H. Hamburger, Über eine Erweiterung des Stieltjesschen Momentenproblem, I, II, III,
Math. Annalen, vol. 81(1920), pp. 235-319; vol. 82(1921), pp. 120-164, 168-187. See also
M. Riesz, Sur le problème des moments, III, Arkiv f. Mat., Astron., och Fys., vol. 17, no. 16
(1922-23).

¹¹ J. Shohat, loc. cit.







$a_n = (\delta_n/\delta_{n+1})^{1/2}$. Hence, the set $\{a_n\}$ is an invariant of any set S made up of OP.
Moreover, (29) gives here

(29') $a_n \ge \alpha_{2n}^{-1/2}$ $(n = 1, 2, 3, \dots)$.

Thus, for Hermite and Laguerre polynomials we have respectively:¹²

$a_n \ge [\Gamma(n + \tfrac12)]^{-1/2}$ and $a_n \ge [\Gamma(2n + \alpha)]^{-1/2}$ $(n = 1, 2, 3, \dots)$.


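Returning now to the recurrence relation (7) and the general persymmetric
polynomials, as a consequence of (6), we see that the sequence $\{\lambda_n\}$ is an invariant
of the set S. Thus, for any sequence $\{{}_k\Phi_n(x)\}$ of a given set S, (7) may
be written as

(30) ${}_k\Phi_n(x) = (x - {}_kc_n)\,{}_k\Phi_{n-1}(x) - \lambda_n\,{}_k\Phi_{n-2}(x)$
$(n = 2, 3, \dots;\ {}_k\Phi_0 = 1,\ {}_k\Phi_1(x) = x - {}_kc_1)$,

where $\lambda_n$ is invariant in S, and only $\{{}_kc_n\}$ varies with $\{{}_k\Phi_n(x)\}$. In particular,
the recurrence relation for the symmetric sequence $\{{}_s\Phi_n(x)\}$ of S is

(31) ${}_s\Phi_n(x) = x\,{}_s\Phi_{n-1}(x) - \lambda_n\,{}_s\Phi_{n-2}(x)$ $(n = 2, 3, \dots;\ {}_s\Phi_1(x) = x)$.

The recurrence relation (30) leads to another representation of any ${}_k\Phi_n(x)$
of a given set S,¹³

(32) ${}_k\Phi_n(x) = \begin{vmatrix} x - {}_kc_1 & \lambda_2 & 0 & \cdots & 0 & 0 \\ 1 & x - {}_kc_2 & \lambda_3 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & x - {}_kc_{n-1} & \lambda_n \\ 0 & 0 & 0 & \cdots & 1 & x - {}_kc_n \end{vmatrix},$

where the $\lambda_i$ are invariants in S. If $\lambda_i > 0$ $(i = 2, 3, \dots)$, dropping the subscript
$k$, we may write (32) in the symmetric form¹⁴ $(n = 1, 2, \dots)$

(33) $\Phi_n(x) = \begin{vmatrix} x - c_1 & \lambda_2^{1/2} & 0 & \cdots & 0 \\ \lambda_2^{1/2} & x - c_2 & \lambda_3^{1/2} & \cdots & 0 \\ 0 & \lambda_3^{1/2} & x - c_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & & x - c_{n-1} & \lambda_n^{1/2} \\ 0 & \cdots & & \lambda_n^{1/2} & x - c_n \end{vmatrix}.$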

¹² The exact values for $a_n$ in the respective cases are $a_n = 2^{n/2}[\pi^{1/2}\Gamma(n + 1)]^{-1/2}$ and $a_n =
[\Gamma(n + 1)\Gamma(n + \alpha)]^{-1/2}$; see J. Shohat, loc. cit., p. 30.

¹³ O. Perron, Die Lehre von den Kettenbrüchen, 1913, p. 11.

¹⁴ O. Bottema, Die Nullstellen gewisser durch Rekursionsformeln definierten Polynome,
Akad. Amsterdam, Proc. Sec. Sc., vol. 34(1931), pp. 681-691.







Moreover, $\Delta_i > 0$ $(i = 1, 2, \dots)$ implies $\lambda_i > 0$ $(i = 2, 3, \dots)$,¹⁵ and hence,
we are again led naturally to consider OP.
We now turn to a study of expression (33) for $\Phi_n(x)$.


2. The Jacobi determinant expression for orthogonal polynomials; associated
orthogonal polynomials. By definition, a Jacobi matrix $\|a_{ij}\|_1^n$ is a matrix in
which $a_{ij} = a_{ji}$ and $a_{ij} = 0$ if $i < j - 1$. The representation (33) then leads
at once to

THEOREM V. Any OP $\Phi_n(x)$ is $(-1)^{n}$ times the characteristic function of the
Jacobi matrix $\|a_{ij}\|$, where $a_{ii} = c_i$, and $a_{i,i+1} = -\lambda_{i+1}^{1/2}$.

Moreover, the matrix of the determinant (33) is itself a Jacobi matrix, and
hence, we shall call this representation of $\Phi_n(x)$ the Jacobi determinant expression
for $\Phi_n(x)$. We shall call the representation (2) for any OP $\Phi_n(x)$ the persymmetric
determinant expression for $\Phi_n(x)$.

The expression (33) for $\Phi_n(x)$ as a Jacobi determinant is not unique. Since
the sets $\{c_i\}_1^n$ and $\{\lambda_i\}_2^n$ constitute a set of $2n - 1$ constants upon which are
imposed only the condition $\lambda_i > 0$ $(i = 2, 3, \dots, n)$ and the $n$ conditions that
certain functions of them equal numerically the coefficients of the powers of $x$
in $\Phi_n(x)$, in general there are $\infty^{n-1}$ ways of expressing $\Phi_n(x)$ as a Jacobi determinant
of type (33). Thus, for

$\Phi_2(x) = x^{2} + d_1x + d_0 = \begin{vmatrix} x - c_1 & \lambda_2^{1/2} \\ \lambda_2^{1/2} & x - c_2 \end{vmatrix},$

$c_1$, $c_2$, and $\lambda_2$ are determined in terms of $d_0$ and $d_1$ by means of the relations:
$c_1 + c_2 = -d_1$, $c_1c_2 > d_0$ and $\lambda_2 = c_1c_2 - d_0$.

THEOREM VI. Given $\delta > 0$ and $n$, there is a sequence of OP $\{\Phi_n(x)\}$ such
that the zeros of $\Phi_n(x)$ differ from any sequence of constants $\{c_i\}_1^n$ by less than $\delta$.


Proof. Introduce a sequence of OP $\{T_n(x)\}$:


zI-t é 0 ee 0 
é I—Ce é 0 
atiat © & 2-6; 0 
(34) 0 0 0 1s 2—to ot 
0 0 0 oes é! t— Cn 
= Il (x tang c;) + eP,-2(2) + é P,4(z) + + ae n(x) + eR(z). 
1 


When $\epsilon$ is chosen sufficiently small, the theorem is obvious if we let $T_n(x) = \Phi_n(x)$.


¹⁵ If we impose the added condition $\lambda_1 > 0$, then conversely $\lambda_i > 0$ $(i = 1, 2, \dots)$ implies
$\Delta_i > 0$ $(i = 1, 2, \dots)$.



















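(35) TABLE OF CONSTANTS $c_n$ ($n \ge 1$) AND $\lambda_n$ ($n \ge 2$)

Trigonometric polynomials:
$$c_n = 0, \qquad \lambda_n = \tfrac14 \ (n > 2), \qquad \lambda_2 = \tfrac12.$$

Legendre polynomials:
$$c_n = 0, \qquad \lambda_n = \frac{(n-1)^2}{(2n-1)(2n-3)}.$$

Hermite polynomials:
$$c_n = 0, \qquad \lambda_n = \tfrac12 (n-1).$$

Laguerre polynomials:
$$c_n = 2n + \alpha - 2, \qquad \lambda_n = (n-1)(n+\alpha-2) \qquad (\alpha > 0).$$

Jacobi polynomials in $(-1, 1)$ with $(\alpha, \beta) > 0$:
$$c_n = -\frac{(\alpha-\beta)(\alpha+\beta-2)}{(\alpha+\beta+2n-2)(\alpha+\beta+2n-4)},$$
$$\lambda_n = \frac{4(n-1)(n+\alpha+\beta-3)(n+\alpha-2)(n+\beta-2)}{(2n+\alpha+\beta-3)(2n+\alpha+\beta-4)^2(2n+\alpha+\beta-5)}.$$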
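The constants of table (35) can be tried out directly. The sketch below builds the monic polynomials from the recurrence $\Phi_i = (x - c_i)\Phi_{i-1} - \lambda_i\Phi_{i-2}$ with $\Phi_0 = 1$, $\Phi_1 = x - c_1$, which is assumed here to be the relation (7) of the earlier part of the paper; the helper name `monic_polys` and the calling convention are illustrative.

```python
from fractions import Fraction

def monic_polys(c, lam, n):
    """Coefficient lists (lowest degree first) of Phi_0, ..., Phi_n.
    c(i) gives c_i for i >= 1, lam(i) gives lambda_i for i >= 2."""
    polys = [[Fraction(1)], [-c(1), Fraction(1)]]
    for i in range(2, n + 1):
        p1, p2 = polys[-1], polys[-2]
        new = [Fraction(0)] + p1                 # x * Phi_{i-1}
        for k, a in enumerate(p1):
            new[k] -= c(i) * a                   # - c_i * Phi_{i-1}
        for k, a in enumerate(p2):
            new[k] -= lam(i) * a                 # - lambda_i * Phi_{i-2}
        polys.append(new)
    return polys[: n + 1]

# Legendre row of the table: c_n = 0, lambda_n = (n-1)^2 / ((2n-1)(2n-3)).
legendre = monic_polys(
    lambda i: Fraction(0),
    lambda i: Fraction((i - 1) ** 2, (2 * i - 1) * (2 * i - 3)),
    3,
)
```

With these inputs `legendre[3]` holds the coefficients of $x^3 - \tfrac35 x$, the monic Legendre polynomial of degree three.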


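Given any positive integer $n$, we associate with $\Phi_n(x)$ and any of its representations (33) the system $\{\Theta_i(x)\}_0^n$ of polynomials defined as follows:

(36)
$$\Theta_0(x, n) = 1, \quad \Theta_1(x, n) = x - c_n, \quad \Theta_2(x, n) = \begin{vmatrix} x - c_{n-1} & \lambda_n^{1/2} \\ \lambda_n^{1/2} & x - c_n \end{vmatrix},$$
$$\Theta_3(x, n) = \begin{vmatrix} x - c_{n-2} & \lambda_{n-1}^{1/2} & 0 \\ \lambda_{n-1}^{1/2} & x - c_{n-1} & \lambda_n^{1/2} \\ 0 & \lambda_n^{1/2} & x - c_n \end{vmatrix}, \quad \ldots, \quad \Theta_n(x, n) = \Phi_n(x),$$

where $\Theta_i(x) = \Theta_i(x, n)$. The $\Theta_i(x)$ satisfy the recurrence relation

(37)
$$\Theta_i(x) = (x - c_i')\Theta_{i-1}(x) - \lambda_i'\Theta_{i-2}(x) \qquad (i = 2, 3, \ldots, n;\ \Theta_0(x) = 1,\ \Theta_1(x) = x - c_n),$$

where $c_i' = c_{n-i+1}$, $\lambda_i' = \lambda_{n-i+2} > 0$ ($i = 2, 3, \ldots, n$). The $\Theta_i(x, n)$ are denominators of the successive convergents of the continued fraction

$$\frac{\lambda\,|}{|\,x - c_n} - \frac{\lambda_n\,|}{|\,x - c_{n-1}} - \frac{\lambda_{n-1}\,|}{|\,x - c_{n-2}} - \cdots - \frac{\lambda_2\,|}{|\,x - c_1} \qquad (\lambda > 0 \text{ and arbitrary}).$$

(37) shows that $\{\Theta_i(x)\}_0^n$ represents a finite sequence of orthogonal polynomials, and we have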










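THEOREM VII. Corresponding to any representation of a polynomial $\Phi_n(x)$ as a Jacobi determinant of type (33) there exist two finite systems of orthogonal polynomials $\{\phi_i(x)\}_0^n$ and $\{\Theta_i(x)\}_0^n$, where$^{16}$

(38)
$$\phi_0(x) = 1, \quad \phi_1(x) = x - c_1, \quad \phi_2(x) = \begin{vmatrix} x - c_1 & \lambda_2^{1/2} \\ \lambda_2^{1/2} & x - c_2 \end{vmatrix}, \quad \phi_3(x) = \begin{vmatrix} x - c_1 & \lambda_2^{1/2} & 0 \\ \lambda_2^{1/2} & x - c_2 & \lambda_3^{1/2} \\ 0 & \lambda_3^{1/2} & x - c_3 \end{vmatrix}, \quad \ldots,$$
$$\Theta_0(x) = 1, \quad \Theta_1(x) = x - c_n, \quad \Theta_2(x) = \begin{vmatrix} x - c_{n-1} & \lambda_n^{1/2} \\ \lambda_n^{1/2} & x - c_n \end{vmatrix}, \quad \Theta_3(x) = \begin{vmatrix} x - c_{n-2} & \lambda_{n-1}^{1/2} & 0 \\ \lambda_{n-1}^{1/2} & x - c_{n-1} & \lambda_n^{1/2} \\ 0 & \lambda_n^{1/2} & x - c_n \end{vmatrix}, \quad \ldots, \quad \Theta_n(x) = \Phi_n(x).$$

The sequence of polynomials $\{\Theta_i(x)\}_0^n$ when considered in connection with $\{\phi_i(x)\}_0^n$ shall be called a sequence of associated orthogonal polynomials.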

For the two sequences $\{\Theta_i(x)\}_0^n$ and $\{\phi_i(x)\}_0^n$ to coincide it is easily seen that the necessary and sufficient condition is: $c_i = c_{n-i+1}$ ($i = 1, 2, \ldots, n$) and $\lambda_i = \lambda_{n-i+2}$ ($i = 2, 3, \ldots, n$). In order that these relations hold for every $n$, it is necessary that $c_i = \text{constant } c$ ($i = 1, 2, \ldots$) and $\lambda_i = \text{constant } \lambda > 0$ ($i = 2, 3, \ldots$), i.e., that the polynomials satisfy the recurrence relation

$$\Phi_n(x) = (x - c)\Phi_{n-1}(x) - \lambda\Phi_{n-2}(x).$$

The sequence satisfying this relation is
$$\left\{ \lambda^{n/2}\, \frac{\sin\bigl((n+1) \arccos \tfrac12 \lambda^{-1/2}(x - c)\bigr)}{\sin \arccos \tfrac12 \lambda^{-1/2}(x - c)} \right\}_0^\infty.$$
In particular, when $c = 0$ and $\lambda = \tfrac14$, this reduces to the classical sequence
$$\left\{ 2^{-n}\, \frac{\sin\bigl((n+1) \arccos x\bigr)}{\sin \arccos x} \right\}.$$
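The closed form for constant coefficients can be checked numerically. The sketch below compares it with the recurrence at points where $|x - c| < 2\lambda^{1/2}$, so that the arc cosine is real; it assumes the initial values $\Phi_0 = 1$, $\Phi_1 = x - c$, and the function names are illustrative.

```python
import math

def by_recurrence(n, x, c, lam):
    """Phi_n(x) from Phi_n = (x - c) Phi_{n-1} - lam Phi_{n-2}."""
    prev, cur = 1.0, x - c                       # Phi_0, Phi_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, (x - c) * cur - lam * prev
    return cur

def closed_form(n, x, c, lam):
    """lam^(n/2) sin((n+1) t) / sin t with cos t = (x - c) / (2 sqrt(lam))."""
    theta = math.acos((x - c) / (2.0 * math.sqrt(lam)))
    return lam ** (n / 2) * math.sin((n + 1) * theta) / math.sin(theta)
```

For example, `closed_form(3, 1.0, 0.5, 2.0)` and `by_recurrence(3, 1.0, 0.5, 2.0)` agree to rounding error.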
Since a sequence of OP which is determined by a recurrence relation of type (7) forms a Sturm chain,$^{17}$ it follows that

(1) all polynomials (38) have real and distinct zeros;
(2) the zeros of $\Phi_n(x) = \Theta_n(x)$ are separated by the zeros of $\Theta_{n-1}(x)$;


16 From known properties of the weight functions corresponding to such finite systems (Hamburger, loc. cit.) we conclude the weight functions $\psi_1(x)$ and $\psi_2(x)$ corresponding respectively to $\{\phi_i(x)\}_0^n$ and $\{\Theta_i(x)\}_0^n$ are step-functions taking exactly $n + 1$ distinct values in the interval $(-\infty, \infty)$. The points of discontinuity of $\psi_1(x)$ and $\psi_2(x)$ coincide with the zeros of $\Phi_n(x)$; and the sum of the saltus is the same for $\psi_1(x)$ and $\psi_2(x)$.

17 For the properties of a Sturm chain see, for example, J.-A. Serret, Cours d'Algèbre Supérieure, Tome I, 1885, pp. 276-305.










(3) the zeros of $\Theta_j(x)$ are separated by the zeros of $\Theta_{j-1}(x)$ ($j = 2, 3, \ldots, n$);

(4) the zeros of $\Theta_j(x)$ ($j = 1, 2, \ldots, n - 1$) lie within the interval formed by the extreme zeros of $\Phi_n(x)$.

Let $\{x_{ij}\}$ and $\{y_{ij}\}$ ($i = 1, 2, \ldots, j$) denote respectively the zeros of $\Phi_j(x)$ and $\Theta_j(x)$ arranged in each case in increasing order of magnitude.

Then, considering $\Theta_1(x, n) = x - c_n$, and recalling $x_{1n} < y_{1,n-1} < y_{1,n-2} < \cdots$, $x_{nn} > y_{n-1,n-1} > y_{n-2,n-2} > \cdots$, we have

(39) $x_{1n} \le c_j \le x_{nn}$ ($j = 1, 2, \ldots, n$).

Likewise, considering $\Theta_2(x, n) = x^2 - (c_n + c_{n-1})x + c_n c_{n-1} - \lambda_n$, we obtain in a very simple manner

(40)
$$x_{1n} < \frac{c_j + c_{j-1} - [(c_j - c_{j-1})^2 + 4\lambda_j]^{1/2}}{2}, \qquad x_{nn} > \frac{c_j + c_{j-1} + [(c_j - c_{j-1})^2 + 4\lambda_j]^{1/2}}{2} \qquad (j = 2, 3, \ldots, n),$$

(41)
$$x_{1n} < \frac{c_j + c_{j-1}}{2} - \lambda_j^{1/2}, \qquad x_{nn} > \frac{c_j + c_{j-1}}{2} + \lambda_j^{1/2} \qquad (j = 2, 3, \ldots, n).$$

In particular, if $c_i$ and $\lambda_i$, for $i = 1, 2, \ldots, n$ and $i = 2, 3, \ldots, n$, respectively, attain their maxima $c_M = c_M(n)$ and $\lambda_M = \lambda_M(n)$ for the same $i = M$, and their minima $c_m$ and $\lambda_m$ for the same $i = m$, then (41) gives

(42) $x_{1n} < c_{m-1} - \lambda_m^{1/2}$, $\quad x_{nn} > c_{M-1} + \lambda_M^{1/2}$.

If $c_n \to c$ and $\lambda_n \to \lambda$ as $n \to \infty$, then since $\lim_{n\to\infty} x_{1n}$ and $\lim_{n\to\infty} x_{nn}$ are known to exist,

(43) $\lim_{n\to\infty} x_{1n} \le c - \lambda^{1/2}$, $\quad \lim_{n\to\infty} x_{nn} \ge c + \lambda^{1/2}$.

Denote by $F_i$ the $i$-th convergent of the continued fraction

(44)
$$\frac{2k-1}{k} - \frac{1\,|}{|\,\dfrac{2k-1}{k}} - \frac{1\,|}{|\,\dfrac{2k-1}{k}} - \cdots \qquad (k \ge 2).$$

Then it may be shown that $F_i < 0$ for some $i \le k + 2$.
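The claim about the convergents can be explored with exact arithmetic. The sketch below assumes that (44) is the periodic fraction $t - 1/(t - 1/(t - \cdots))$ with $t = (2k-1)/k$, so that $F_1 = t$ and $F_i = t - 1/F_{i-1}$; this reading of the garbled display and the function name are assumptions.

```python
from fractions import Fraction

def first_negative_convergent(k):
    """Return (i, F_i) for the first convergent F_i that is not positive."""
    t = Fraction(2 * k - 1, k)
    f, i = t, 1                                  # F_1 = t
    while f > 0:
        f = t - 1 / f                            # F_i = t - 1 / F_{i-1}
        i += 1
    return i, f
```

For $k = 2$ the convergents are $3/2,\ 5/6,\ 3/10,\ -11/6$, so the first negative one has index $4 = k + 2$.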


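THEOREM VIII. Assume $\lambda_i - \lambda_{n-k} \ge 0$ for any $k \le n - 2$ and for every $i = n - k + 1, n - k + 2, \ldots, n$, and either
Case I. $c_i - c_{n-k-1} \ge 0$, or
Case II. $c_{n-k-1} - c_i \ge 0$,
for every $i = n - k, n - k + 1, \ldots, n$; then

(45)
$$x_{nn} > c_{n-k-1} + \frac{2k-1}{k}\lambda_{n-k}^{1/2}, \quad \text{or} \quad x_{1n} < c_{n-k-1} - \frac{2k-1}{k}\lambda_{n-k}^{1/2},$$

respectively.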










Proof. Consider Case I. In $\Theta_{k+2}(x, n)$ substitute

$$x = c_{n-k-1} + \frac{2k-1}{k}\lambda_{n-k}^{1/2} = v_k.$$

Then $\Theta_{k+2}(v_k, n) = |a_{ij}|$, where

$$a_{ii} = \frac{2k-1}{k}\lambda_{n-k}^{1/2} - (c_{n-k+i-2} - c_{n-k-1}), \qquad a_{i,i+1} = a_{i+1,i} = \lambda_{n-k+i-1}^{1/2}, \qquad a_{ij} = 0 \quad (i - j \ne 0, \pm 1).$$

Subtract $k(2k - 1)^{-1}$ times the first column of $|a_{ij}|$ from the second. Then $|a_{ij}|$ can be written as $(2k - 1)k^{-1}\lambda_{n-k}^{1/2}$ times a determinant of lower order. Denote the elements of the resulting second column by $a'_{j2}$ ($j = 1, 2, \ldots, k + 2$). Subtract $(a'_{22})^{-1}\lambda_{n-k+1}^{1/2}$ times this column from the third. Then $|a_{ij}|$ can be written as $a'_{22}$ times a determinant of lower order, where

$$a'_{22} = \left(\frac{2k-1}{k} - \frac{k}{2k-1}\right)\lambda_{n-k}^{1/2} - (c_{n-k} - c_{n-k-1}) \le \left(\frac{2k-1}{k} - \frac{k}{2k-1}\right)\lambda_{n-k}^{1/2} = F_2\,\lambda_{n-k}^{1/2}.$$

Similarly, denote the elements of the resulting third column by $a''_{j3}$ ($j = 1, 2, \ldots, k + 2$), and subtract $(a''_{33})^{-1}\lambda_{n-k+2}^{1/2}$ times this third column from the fourth. Then, as long as $a'_{22} > 0$,

$$a''_{33} = \frac{2k-1}{k}\lambda_{n-k}^{1/2} - (c_{n-k+1} - c_{n-k-1}) - \frac{\lambda_{n-k+1}}{a'_{22}} \le \left(\frac{2k-1}{k} - \frac{1}{F_2}\right)\lambda_{n-k}^{1/2} = F_3\,\lambda_{n-k}^{1/2}.$$

In general
$$a^{(i-1)}_{ii} \le F_i\,\lambda_{n-k}^{1/2} \qquad (i = 2, 3, \ldots, k + 2).$$

Since $F_i < 0$ for some $i \le k + 2$, some element in the sequence $a_{11}, a'_{22}, \ldots, a^{(k+1)}_{k+2,k+2}$ must be negative. Denote the first negative element in this sequence by $a_{gg}$. Consider the determinant obtainable from $|a_{ij}|$ by striking out its last $k + 2 - g$ rows and columns. This determinant may be written as $\phi_g(v_k, n)$, where $\phi_g(x, n)$ is a polynomial of the type $\phi_i(x)$ associated with $\Theta_{k+2}(x, n)$ in the manner discussed in Theorem VII. The above method factors $\phi_g(v_k, n)$ into $a_{11}a'_{22}a''_{33} \cdots a_{gg} < 0$. Hence, $\phi_g(x, n) = 0$ for some $x > v_k$, and hence also $y_{k+2,k+2} > v_k$. Since $x_{nn} \ge y_{k+2,k+2}$, the theorem follows for Case I.













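By our substituting $x = c_{n-k-1} - (2k - 1)k^{-1}\lambda_{n-k}^{1/2}$ in $\Theta_{k+2}(x, n)$, a similar argument leads, in Case II, to the upper bound for $x_{1n}$.

COROLLARY. Let, for $n \to \infty$, $\lambda_n \to \lambda$ and $c_n \to c$, and let every $\lambda_i \le \lambda$ and

Case I. every $c_i \le c$, or

Case II. every $c_i \ge c$;

then

(46) $\lim_{n\to\infty} x_{nn} \ge c + 2\lambda^{1/2}$ or $\lim_{n\to\infty} x_{1n} \le c - 2\lambda^{1/2}$,

respectively.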


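THEOREM IX. With the notation of (42),

(47) $$x_{nn} < c_M + 2\lambda_M^{1/2} - \frac{n+1}{2^{n+1}}\lambda_M^{1/2},$$

(48) $$x_{1n} > c_m - 2\lambda_M^{1/2} + \frac{n+1}{2^{n+1}}\lambda_M^{1/2}.$$

Proof. In (33) substitute $x = c_M + 2\lambda_M^{1/2} - \epsilon = w_\epsilon$. We obtain $\Phi_n(w_\epsilon) = |a_{ij}|$, where

$$a_{ii} = c_M - c_i + 2\lambda_M^{1/2} - \epsilon, \qquad a_{i,i+1} = a_{i+1,i} = \lambda_{i+1}^{1/2}, \qquad a_{ij} = 0 \quad (i - j \ne 0, \pm 1).$$

Subtract $(a_{nn})^{-1}\lambda_n^{1/2}$ times the $n$-th column of $|a_{ij}|$ from the $(n - 1)$-th. Then we have as a factor of $|a_{ij}|$

$$a_{nn} = c_M - c_n + 2\lambda_M^{1/2} - \epsilon > 0 \quad \text{if } \epsilon < 2\lambda_M^{1/2}.$$

Denote the elements of the resulting $(n - 1)$-th column by $a'_{j,n-1}$ ($j = 1, 2, \ldots, n$). Subtract $(a'_{n-1,n-1})^{-1}\lambda_{n-1}^{1/2}$ times the resulting $(n - 1)$-th column of $|a_{ij}|$ from the $(n - 2)$-th. Then as a factor of $|a_{ij}|$ we have

$$a'_{n-1,n-1} = c_M - c_{n-1} + 2\lambda_M^{1/2} - \epsilon - \frac{\lambda_n}{c_M - c_n + 2\lambda_M^{1/2} - \epsilon} \ge 2\lambda_M^{1/2} - \epsilon - \frac{\lambda_M}{2\lambda_M^{1/2} - \epsilon} > \tfrac32\lambda_M^{1/2} - 2\epsilon > 0, \quad \text{if } \epsilon < \tfrac34\lambda_M^{1/2}.$$

Continuing this process and using primes to indicate the elements of the altered columns, we obtain

(49) $$|a_{ij}| = a_{nn}\,a'_{n-1,n-1}\,a''_{n-2,n-2} \cdots a^{(n-1)}_{11},$$

where $a^{(k-1)}_{n-k+1,n-k+1} > (k + 1)k^{-1}\lambda_M^{1/2} - b_k k^{-1}\epsilon > 0$ if $\epsilon < (k + 1)b_k^{-1}\lambda_M^{1/2}$ ($b_k = 2^{k+1} - k - 2$). Hence, for any $\epsilon < (n + 1)2^{-n-1}\lambda_M^{1/2}$, we have expressed $|a_{ij}|$ as a product of positive factors, and consequently $x_{nn} < w_\epsilon$. The lower bound for $x_{1n}$ is obtainable in a similar manner.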










COROLLARY. If $c_M(n) \to c$ and $\lambda_M(n) \to \lambda$, as $n \to \infty$, $c$ and $\lambda$ finite, then

(50) $\lim_{n\to\infty} x_{nn} \le c + 2\lambda^{1/2}$.

Also if $c_m(n) \to c'$ as $n \to \infty$, then

(51) $\lim_{n\to\infty} x_{1n} \ge c' - 2\lambda^{1/2}$.

Combining (46), (50) and (51), we have $\lim_{n\to\infty} x_{nn} = c + 2\lambda^{1/2}$ ($c = \lim_{n\to\infty} c_M(n)$), and $\lim_{n\to\infty} x_{1n} = c' - 2\lambda^{1/2}$ ($c' = \lim_{n\to\infty} c_m(n)$).

Now let $\lim_{n\to\infty} c_n = c$ and $\lim_{n\to\infty} \lambda_n = \lambda$. Select $k$ such that $c_i < c + \epsilon$ ($i = k, k + 1, \ldots$) and $\lambda_i < \lambda + \epsilon$ ($i = k + 1, k + 2, \ldots$). Consider $\Theta_{n-k+1}(x, n)$, $n$ arbitrary. Stieltjes$^{18}$ has shown that between any two zeros of an OP $\Phi_n(x)$ there lies at least one zero of $\Phi_{n+r}(x)$ ($r = 1, 2, \ldots$). Hence, relation (38) implies that at most $k$ zeros of $\Phi_n(x)$ lie outside the interval containing all zeros of $\Theta_{n-k+1}(x, n)$, yielding, since $\epsilon$ may be chosen arbitrarily small, $\lim_{n\to\infty} y_{n-k+1,n-k+1}$ ($\le c + 2\lambda^{1/2}$) as an upper bound for the interval in which the zeros of $\{\Phi_n(x)\}$ may be everywhere dense. Moreover, as a consequence of Theorem VIII for $n$, $k$, and $n - k \to \infty$, the zeros of $\{\Theta_{n-k+1}(x, n)\}$ are dense in the neighborhood of $c + 2\lambda^{1/2}$. A similar argument leads to the lower bound, and we have

THEOREM X. The zeros of $\{\Phi_n(x)\}$ are nowhere dense outside the interval $(c - 2\lambda^{1/2}, c + 2\lambda^{1/2})$, and are dense in the neighborhood of the end points of this interval.$^{19}$

Expression (33) implies that $\Phi_n(x)$ is the characteristic function of the Jacobi form
(52) Qu, ¥) = —Leyi +2 I Myya. 


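Its spectrum, i.e., the zeros of $\Phi_n(x)$ in (33), has been investigated by Krein.$^{20}$

THEOREM XI (Theorem II of Krein). If the $c_j$ ($j = 1, 2, \ldots$) and $\lambda_k$ ($k = 2, 3, \ldots$) corresponding to sequences of OP $\Phi_n(x)$ vary independently, then

(53) $$\frac{\partial x_{in}}{\partial c_k} \ge 0 \qquad (i, k = 1, 2, \ldots, n).$$

(54) In the sequence $\dfrac{\partial x_{in}}{\partial \lambda_2}, \dfrac{\partial x_{in}}{\partial \lambda_3}, \ldots, \dfrac{\partial x_{in}}{\partial \lambda_n}$ ($i = 1, 2, \ldots, n$)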
18 T. J. Stieltjes, Recherches sur les fractions continues, Ann. Fac. Sci. Toulouse, vol. 8 (1894), J, pp. 1-122; vol. 9 (1895), A, pp. 1-47.
19 O. Blumenthal (Ueber die Entwickelung einer willkürlichen Funktion nach den Nennern des Kettenbruches für $\int_{-\infty}^{0} \frac{\varphi(\xi)\,d\xi}{z - \xi}$, Thesis, Göttingen, 1898) has shown that if $c_n \to c$ and $\lambda_n \to \lambda$, the interval in which the zeros of $\{\Phi_n(x)\}$ are everywhere dense is exactly $(c - 2\lambda^{1/2}, c + 2\lambda^{1/2})$.
20 M. Krein, loc. cit.










i — 1 terms have the sign + and n — ¢ terms the sign —, if suitable signs are 
affixed to possible zero terms. 
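COROLLARY (Krein). $x_{1n}$ is a non-increasing function, and $x_{nn}$ a non-decreasing function, of $\lambda_2, \lambda_3, \ldots, \lambda_n$:

(55) $$\frac{\partial x_{1n}}{\partial \lambda_j} \le 0, \qquad \frac{\partial x_{nn}}{\partial \lambda_j} \ge 0 \qquad (j = 2, 3, \ldots, n).$$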

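THEOREM XII (Theorem III of Krein).

(56)
$$c_m - 2\lambda_M^{1/2}\cos\frac{\pi}{n+1} \le x_{1n} \le c_M - 2\lambda_m^{1/2}\cos\frac{\pi}{n+1},$$
$$c_m + 2\lambda_m^{1/2}\cos\frac{\pi}{n+1} \le x_{nn} \le c_M + 2\lambda_M^{1/2}\cos\frac{\pi}{n+1}.$$

By use of the zeros of the sequence

$$\left\{ \lambda^{n/2}\, \frac{\sin\bigl((n+1)\arccos\frac{x - c}{2\lambda^{1/2}}\bigr)}{\sin\arccos\frac{x - c}{2\lambda^{1/2}}} \right\}_0^\infty$$

this theorem may, of course, be obtained as an immediate corollary of Theorem XI.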
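Krein's bounds can be compared with numerically computed extreme zeros. The sketch below locates the $k$-th zero of $\Phi_n$ by bisection on a Sturm count (the number of negative pivots in the triangular factorization of the Jacobi matrix shifted by $x$), then evaluates the four bounds of Theorem XII. This is a standard numerical device, not the method of the paper; the list layout (`c` for $c_1, \ldots, c_n$, `lam` for $\lambda_2, \ldots, \lambda_n$) and all names are assumptions.

```python
import math

def count_below(c, lam, x):
    """Number of zeros of Phi_n that are smaller than x."""
    count, q = 0, 1.0
    for i, ci in enumerate(c):
        q = (ci - x) - (lam[i - 1] / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300                           # nudge off an exact zero pivot
        if q < 0.0:
            count += 1
    return count

def kth_zero(c, lam, k, tol=1e-12):
    """k-th smallest zero (k = 1..n), by bisection inside a Gershgorin interval."""
    r = 2.0 * math.sqrt(max(lam))
    lo, hi = min(c) - r, max(c) + r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(c, lam, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def krein_bounds(c, lam):
    """((lower, upper) for x_1n, (lower, upper) for x_nn) as in Theorem XII."""
    cosine = math.cos(math.pi / (len(c) + 1))
    c_m, c_M = min(c), max(c)
    s_m, s_M = math.sqrt(min(lam)), math.sqrt(max(lam))
    return ((c_m - 2 * s_M * cosine, c_M - 2 * s_m * cosine),
            (c_m + 2 * s_m * cosine, c_M + 2 * s_M * cosine))
```

When all $c_i$ are equal and all $\lambda_i$ are equal, the lower and upper bounds coincide and give the extreme zeros exactly.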

Krein's upper bound for $x_{nn}$ and lower bound for $x_{1n}$ are better than the bounds obtained in Theorem IX. The latter theorem was included to indicate more fully how bounds for the zeros of $\Phi_n(x)$ may be obtained from its Jacobi determinant expression by purely algebraic methods. When $c_m(n) \ne c_M(n)$ and $\lambda_m(n) \ne \lambda_M(n)$, it is clear that for a given sequence of OP at most one of Krein's bounds for each zero will be good for $n$ large. But if, for example, $c_M(n) \to c$ and $\lambda_M(n) \to \lambda$, then Krein's bounds together with Theorem VIII, for a suitably chosen $k$, lead to asymptotic expressions for $x_{nn}$. Thus, for classical cases we have

1. Hermite polynomials: Choose $k = N((2n)^{1/2})$, where $N(x)$ denotes the next integer $\ge x$. Then, from (45) and (56),$^{21}$

(57) $$(2n)^{1/2} - \tfrac32 + O(n^{-1/2}) < {}_H x_{nn} = -{}_H x_{1n} < (2n)^{1/2} - (2n)^{-1/2} + O(n^{-3/2}),$$
i.e., ${}_H x_{nn} = -{}_H x_{1n} = (2n)^{1/2} + O(1)$.
2. Laguerre polynomials: Choose $k = N(n^{1/2})$. Then,$^{21}$
(58) $$4n + 2\alpha - 5n^{1/2} + O(1) < {}_L x_{nn} < 4n + 2\alpha - 5 + O(n^{-1}),$$
i.e., ${}_L x_{nn} = 4n + 2\alpha + O(n^{1/2})$.
21 It has been shown that ${}_H x_{nn} = (2n + 1)^{1/2} - 1.8557571(2n + 1)^{-1/6} - 0.3443834(2n + 1)^{-5/6} - 0.168715(2n + 1)^{-3/2} - 0.151965(2n + 1)^{-13/6} + O((2n + 1)^{-17/6})$ (F. Zernike, Eine asymptotische Entwicklung für die grösste Nullstelle der Hermiteschen Polynome, Amsterdam Academy, Proc. of Sec. Sc., vol. 34 (1931), pp. 673-680); and that ${}_L x_{nn} = 4n + 2\alpha - 3.7115142(4n + 2\alpha)^{1/3} + 2.7550676(4n + 2\alpha)^{-1/3} + O(n^{-1})$ (V. E. Spencer, Asymptotic expressions for the zeros of generalized Laguerre polynomials and Weber functions, this Journal, vol. 3 (1937), pp. 667-675).










3. Properties of the zeros and intervals of orthogonality of sequences of OP in a given set S. As a consequence of Theorem XI, it follows that if two sequences $\{{}_{I}\Phi_n(x)\}$ and $\{{}_{II}\Phi_n(x)\}$ of a set $S$ of OP satisfy the condition ${}_{I}c_i \ge {}_{II}c_i$ ($i = 1, 2, 3, \ldots$), then the zeros of the polynomials of these sequences satisfy the inequality ${}_{I}x_{in} \ge {}_{II}x_{in}$ ($i = 1, 2, \ldots, n$; $n = 1, 2, 3, \ldots$).

This inequality may be used to yield bounds for the zeros of the polynomials of any sequence of a set $S$ of OP in terms of the zeros of the symmetric sequence $\{{}_{s}\Phi_n(x)\}$ of $S$. For, if $\{c_i\}$ correspond to $\{\Phi_n(x)\}_0^\infty$, in the same set $S$ take $\{{}_{M}\Phi_n(x)\}_0^\infty$ and $\{{}_{m}\Phi_n(x)\}_0^\infty$, where ${}_{M}c_i = c_M$ and ${}_{m}c_i = c_m$ ($i = 1, 2, \ldots$). These sequences may be obtained from the symmetric sequence $\{{}_{s}\Phi_n(x)\}_0^\infty$ of $S$ by replacing $x$ by $x - c_M$ and $x - c_m$, respectively, and hence the zeros $\{x_{in}\}$ of $\{\Phi_n(x)\}$ and the zeros $\{{}_{s}x_{in}\}$ of $\{{}_{s}\Phi_n(x)\}$ satisfy the inequality

(59) $${}_{s}x_{in} + c_m \le x_{in} \le {}_{s}x_{in} + c_M \qquad (i = 1, 2, \ldots, n).$$


It is known that $\lim x_{1n}$ and $\lim x_{nn}$ both exist, finite or infinite. Define $\lim\,(x_{nn} - x_{1n})$ as the interval of orthogonality of $\{\Phi_n(x)\}$.$^{22}$

THEOREM XIII. Any given set $S$ of OP contains sequences $\{\Phi_n(x)\}$ whose interval of orthogonality is $(-\infty, \infty)$.

Proof. Choose $\{c_n\}$ such that $\overline{\lim}_{n\to\infty}\, c_n = +\infty$ and $\underline{\lim}_{n\to\infty}\, c_n = -\infty$. This is possible by Theorem III, Corollary 2. By virtue of (39) the corresponding sequence of polynomials is one of the required $\{\Phi_n(x)\}$.

COROLLARY. If $\{c_n\}$ is unbounded, then the interval of orthogonality of $\{\Phi_n(x)\}$ is infinite.$^{23}$

THEOREM XIV. Every set $S$ of OP such that $\{\lambda_n\}$ is bounded contains sequences $\{\Phi_n(x)\}$ for which the length of the corresponding interval of orthogonality $(a, b)$ differs from any preassigned number $l > 0$ by a quantity $d \le 2\lambda^{1/2}$, where $\lambda = \lim_{n\to\infty} \max\,(\lambda_2, \lambda_3, \ldots, \lambda_n)$.

Proof. Construct $\{c_n\}$ such that it contains two subsequences $\{c_{n_i}\}$ and $\{c_{n_j}\}$ such that $\lim \lambda_{n_i+1} = \lim \lambda_{n_j+1} = \lambda$, $\lim c_{n_i} = \bar{c}$, $\lim c_{n_j} = \underline{c}$, and

(60) $\bar{c} = \overline{\lim}\, c_n = l - \lambda^{1/2}$,
(61) $\underline{c} = \underline{\lim}\, c_n = \lambda^{1/2}$.

22 The above definition is equivalent to the ordinary definition of the true interval of orthogonality, i.e., the interval in which the associated weight function must be considered.

23 See also J. Shohat, The relation of the classical orthogonal polynomials to the polynomials of Appell, Amer. Jour. Math., vol. 58 (1936), pp. 453-464; Theorem III. This paper appeared after the above results were obtained.
















Then combining (41), (56), (60), and (61), we have

(62) $l \le \lim_{n\to\infty} x_{nn} \le l + \lambda^{1/2}$,

(63) $0 \le -\lim_{n\to\infty} x_{1n} \le \lambda^{1/2}$.

But $\lim_{n\to\infty} x_{nn} - \lim_{n\to\infty} x_{1n} = b - a$; and adding (62) and (63), we have

$$l \le \lim_{n\to\infty} x_{nn} - \lim_{n\to\infty} x_{1n} \le l + 2\lambda^{1/2}.$$

In a sense this also holds for the case that $\{\lambda_n\}$ is unbounded.


THEOREM XV. If for a set $S$ of OP $\{\lambda_n\}$ is unbounded, then the interval of orthogonality corresponding to any sequence $\{\Phi_n(x)\}$ of $S$ is infinite.

Proof. In (41) let $c_r = \min\,(c_i, c_{i-1})$; then
$$x_{nn} > c_r + \lambda_i^{1/2} \qquad (i = 2, 3, \ldots, n).$$

Hence, if $\{c_r + \lambda_i^{1/2}\}$ is unbounded the theorem is proved. If, however, $\{c_r + \lambda_i^{1/2}\}$ is bounded, then the negative elements in $\{c_r\}$ must yield an unbounded sequence. But by (39), $x_{1n} \le c_r$ ($r = 1, 2, 3, \ldots$). Hence, $x_{1n} \to -\infty$, and the theorem is also true in this case.

COROLLARY. If for any sequence $\{\Phi_n(x)\}$ of OP $\{\lambda_n\}$ is unbounded, whether the set $\{c_n\}$ is bounded or not, the interval of orthogonality of $\{\Phi_n(x)\}$ is infinite.$^{24}$


From Theorems XIII and XIV it will be noted that very little can be said of the interval of orthogonality simply from the condition that $\{\lambda_n\}$ be bounded. However, to satisfy the conditions of both these theorems it was necessary to introduce sequences $\{c_n\}$ for which $c_n$ did not approach a limit. If $c_n$ and $\lambda_n$ approach limits with $\lim \lambda_n = \lambda$, Theorem X shows that the interval in which the zeros of $\{\Phi_n(x)\}$ can be everywhere dense has a length $4\lambda^{1/2}$. In particular, if every $c_i = 0$, that is, for symmetric polynomials, the interval of density is $(-2\lambda^{1/2}, 2\lambda^{1/2})$.

Blumenthal$^{25}$ has shown that if $\lambda_n \to \lambda > 0$ and $c_n \to c$, then the interval $I$ within which the zeros of $\{\Phi_n(x)\}$ are everywhere dense is $I = (c - 2\lambda^{1/2}, c + 2\lambda^{1/2})$; hence, in any set $S$ of OP for which $\{\lambda_n\}$ has a unique limit point $\lambda > 0$, the interval in which the zeros are everywhere dense is an invariant for all sequences of $S$ for which $\{c_n\}$ has the same unique limit point $c$.


4. The associated polynomials and continued fractions. It is known that the persymmetric polynomials $\{\Phi_n(x)\}$ are the denominators of the successive convergents $\Omega_n(x)/\Phi_n(x)$ of the continued fraction

(64) $$K(x) = \frac{\lambda_1\,|}{|\,x - c_1} - \frac{\lambda_2\,|}{|\,x - c_2} - \frac{\lambda_3\,|}{|\,x - c_3} - \cdots$$

24 Ibid., footnote 23.
25 O. Blumenthal, loc. cit.










and that the numerators $\{\Omega_n(x)\}$ satisfy the recurrence relation

(65) $$\Omega_n(x) = (x - c_n)\Omega_{n-1}(x) - \lambda_n\Omega_{n-2}(x) \qquad (n = 2, 3, \ldots;\ \Omega_0(x) = 0,\ \Omega_1(x) = \lambda_1),$$

which, except for initial conditions, is the same as (7). Considering the sequence $\{\Omega_n(x)/\lambda_1\}$, we see that the sets $S$ arrange themselves in conjugate pairs $S_1$ and $S_2$ such that if $\{\Phi_n(x)\}$ is any sequence of $S_1$, the corresponding sequence $\{\Omega_n(x)/\lambda_1\}$ lies in $S_2$.

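Restricting ourselves now to OP, let us investigate the finite continued fraction associated with the polynomials $\{\Theta_i(x, n)\}_0^n$. The recurrence relation (37) shows that the $\Theta_i(x, n)$ are denominators of successive convergents of the continued fraction

(66) $$K_n(\Theta(n), x) = \frac{1\,|}{|\,x - c_n} - \frac{\lambda_n\,|}{|\,x - c_{n-1}} - \cdots - \frac{\lambda_3\,|}{|\,x - c_2} - \frac{\lambda_2\,|}{|\,x - c_1} = \frac{\chi_n(x, n)}{\Theta_n(x, n)}.$$

Moreover, from the theory of continued fractions,

(67) $$\frac{\Phi_{n-1}(x)}{\Phi_n(x)} = \frac{1\,|}{|\,x - c_n} - \frac{\lambda_n\,|}{|\,x - c_{n-1}} - \frac{\lambda_{n-1}\,|}{|\,x - c_{n-2}} - \cdots - \frac{\lambda_3\,|}{|\,x - c_2} - \frac{\lambda_2\,|}{|\,x - c_1}.$$

Hence, $\chi_n(x, n) = \Phi_{n-1}(x)$.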
~ 26 
Sherman” has shown that 


, | n n 
K.(6,z) = . | wp Patz). 
1X — Ce | %— Cz | 2 — Cn+1 Q(z) 
wes 1 | re | h. | _ &,(z) 
K, ®@ = : — : ‘ ee athe aig contac me | = = = 
(®, 2) ltz—aq, |t—@& lzr—c, (2) 
imply Qysi(z) = AQ,(x). Moreover, it is known that the zeros of @,(z) 


separate those of ®,(2). Hence, the zeros of the denominator of the /-th con- 
vergent of K,,_;(@(n), x) separate the zeros of the denominator of the (J + 1)-th 
convergent of K,(@(n), x), where l = 1,2,---,n —1. But K,_,(O(n), z) = 
\,Kn-i(O(m — 1), 7). Hence, the zeros of 0,_,(z, n) are separated by the zeros 
of @,-1-.(z, n — 1). Proceeding in like manner from @,-:~:(z, n — 1) to 
O,-2-:(z, n — 2), ete., we have 


THEOREM XVI. In the sequence of $n - i$ polynomials

(69) $$\Theta_{n-i}(x, n),\ \Theta_{n-i-1}(x, n - 1),\ \Theta_{n-i-2}(x, n - 2),\ \ldots,\ \Theta_1(x, i + 1) \qquad (i = 0, 1, \ldots, n - 1)$$


26 J. Sherman, On the numerators of the convergents of the Stieltjes continued fractions, 
Trans. Amer. Math. Soc., vol. 35(1933), pp. 64-87; p. 67. 










the zeros of each polynomial are separated by those of the polynomial next succeeding it.

This is a generalization of a known property of OP for the case $i = 0$. Theorem XVI may also be obtained as a direct consequence of the type of argument employed above in deriving (39) and (40), for the polynomials (69) satisfy the recurrence relation

(70) $$\Theta_{n-i}(x, n) = (x - c_n)\Theta_{n-i-1}(x, n - 1) - \lambda_n\Theta_{n-i-2}(x, n - 2).$$


This shows further that polynomials (69) are denominators of the successive 
convergents of the continued fraction 


- Aint Nive | An 
(71) - _ — tte , 
XL — Ci+1 | © — Cite jt — Cy 
In particular, for i = 1, (71) becomes K‘._:(z). Hence, in this case (69) may 


be written as {Q,(r)/Ai}i. That is, for 7 = 1, sequence (69), extended by 
letting Q) = dX, , differs by a constant factor from the sequence of numerators of the 
first n convergents of K(x). 

A sufficient condition that all sequences of associated polynomials (69) corre- 
sponding to a given sequence of OP | #,(z)} be equivalent, in the sense that any 


two of their polynomials of the same degree be identical, is that {e(x)}i = 


(o,(x)}} , fork = 1, 2,--- ,, and hence, that ¢; = c, (i = 2,3, ---,n) and 
hi = Ao (¢ = 3, 4,---,m). In particular, for the symmetric case (c; = 0), 
with A; = 3, as noted above in §2, this requires that 


. ‘ hi 27 
|, (2)} 12°" (sin (n — 1) are cos x)/sin are cos 7}. 
For i = 2 sequence (69) is, to a constant factor, the sequence of numerators 
° . . yl . . . 
of the successive convergents of the continued fraction K,(z) in (68) in which 
the denominators of the corresponding convergents give, to a constant factor, 
the sequence {2,,(x)}. Proceeding by induction, we obtain the following inter- 
pretation for (69): 

THEOREM XVII. If constant factors are disregarded, the polynomials (69) for $i = k, k - 1$ ($k = 2, 3, \ldots, n$) are respectively numerators and denominators of successive convergents of continued fractions of type (64).

Let us now return to the sequence $\{\Theta_i(x, n)\}_0^n$. It bears an interesting relation to the "associated" continued fraction $K(x)$ in (64). The numerators of its successive convergents are given by the relation


(72) Q(x) = | = = =) dy(y), 


where ¥(y) is a bounded, monotone, non-decreasing function in (— “, ©) such 
that 


(73) { =. ae ne 


or—U |r — Cy r— Co — Cy 


27 See also J. Sherman, ibid., pp. 80-81.








Sherman” found 


n —_ @,, P 
Pe) 0) L(y) Peat) + Ln aly) ®sa@) +o, 


| 


(74) Lann-aly) = 1, Laney) = y — er, 


| 3(y) (y = Cn—1) Ln, n—2(y) = pe er 


Hence, 


(75) | 1(r) = Oo(2, n), ~ oa , es i(x) = 0; (2, n), a ee Ln.o(2) = 0, (zx, n). 


5. An extension of Krein’s theorem. From definition (6) of c,, and X,, itis 
clear that if the {c,}7"' and {A,;}3 remain fixed while c,, increases, then the 
induced behavior of {a;} will be: the {oes }3” * remain fixed and aem—1 increases, 
Similarly, if the te:}7 ' and {A;}3"' remain fixed while X,, increases, the induced 
behavior is: the {a;}?""* remain fixed and aem—2 increases. An increase in 
C2m—1 With the {a;}3""' fixed, however, induces changes in the {c,;}".; and 
‘A,}n41. Instead, let the {a;}3,' vary with a2,-,. Changes in each {a;}3." 
may be uniquely determined from (6), as functions of az,_;, such that the 
fecstnu, and {Ay}R41 remain fixed. A similar statement holds for changes in 
G2m-2. Defining as admissible variations changes in {a;}{"' such that {a;}i" 
is fixed, a, increases, and the {a;}777" vary in such a way that {e;}; and {Aj}? 
(h = 1 = 3(k + 3), nodd;h = 3(k + 2), 1 = 3(k + 4), n even) remain fixed, 


mn ea + 29 . . 
Theorem II of Krein” may be stated in the following manner: 
r r , , . . . . . 2n-1 
THeoremM XIX. For any OP and admissible variations in the set {a;}i": 


(76) fe > 9 (§ = 1,2,--+,9:k = 1,3,---, 2 —B 
Oa, 
(77) In the sequence OFin oF en ove, OXin (¢ = 1, 2, ---,n) 
Ban” Gay, Oa2n—2 


i — 1 terms have the sign + and n — i terms the sign —, if we affix the suitable 
signs to possible zero terms. 


UNIVERSITY OF PENNSYLVANIA. 


28 J. Sherman, ibid., p. 82. 
29 M. Krein, loc. cit. 




















THE ALGEBRA OF LATTICE FUNCTIONS 


By MORGAN WARD


I. Introduction 


1. The numerous disconnected results on numerical functions (that is, functions on the positive integers to the complex numbers) which are summarized in the first volume of Dickson's History have been welded into a simple and coherent theory by Bell in a series of papers culminating in his Algebraic Arithmetic (Bell [1]$^1$). Bell has shown in detail (see, for example, Bell [2], [3], [4], [5], [6]) that all the various inversion formulas, factorability properties, numerical integrations, and so on, of these functions follow from three basic facts.

I. The set of all numerical functions form a ring with respect to the operations of addition and Dirichlet multiplication.

The sum $\sigma = \phi + \psi$ of two numerical functions $\phi$ and $\psi$ is defined by $\sigma(n) = \phi(n) + \psi(n)$, while their Dirichlet product $\tau = \phi\psi$ is defined by

(1.1) $$\tau(n) = \sum_{d\delta = n} \phi(d)\psi(\delta).$$

II. The set of all numerical functions $\phi$ such that $\phi(1) \ne 0$ form a group with respect to Dirichlet multiplication.

The inverse $\phi^{-1}$ of $\phi$ satisfies

(1.2) $$\sum_{d\delta = n} \phi(d)\phi^{-1}(\delta) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases}$$

For example, the inverse of the function $\phi$ defined by $\phi(n) = 1$ for all $n$ is the Möbius function $\mu(n)$.
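Relations (1.1) and (1.2) translate directly into a short computation. The sketch below forms a Dirichlet product and solves (1.2) recursively for the inverse; the function names are illustrative, and values are kept as exact fractions.

```python
from fractions import Fraction

def dirichlet(phi, psi, n):
    """tau(n) = sum over d * delta = n of phi(d) * psi(delta), as in (1.1)."""
    return sum(phi(d) * psi(n // d) for d in range(1, n + 1) if n % d == 0)

def dirichlet_inverse(phi, limit):
    """Values inv[1..limit] of the inverse in (1.2), assuming phi(1) != 0."""
    inv = {1: Fraction(1) / phi(1)}
    for n in range(2, limit + 1):
        # Move the d = 1 term of (1.2) to the other side and solve for inv[n].
        s = sum(phi(d) * inv[n // d] for d in range(2, n + 1) if n % d == 0)
        inv[n] = -s / phi(1)
    return inv

mobius = dirichlet_inverse(lambda n: 1, 12)
```

Here `mobius[n]` reproduces the Möbius function for $n = 1, \ldots, 12$, in agreement with the example in the text.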

III. The set of all factorable functions is closed with respect to the operation of Dirichlet multiplication.

A function $\psi$ is said to be factorable if

(1.3) $\psi(mn) = \psi(m)\psi(n)$ if $m$, $n$ are co-prime.

It may be shown that the factorable functions form a group with respect to Dirichlet multiplication, on excluding the trivial function $\psi$ vanishing for all integers $n$.

Since the positive integers form a semi-ordered set with respect to the 
relation x divides y, and indeed a lattice, it is natural to ask whether results of 
like simplicity and generality hold for functions on semi-ordered sets and lat- 
tices. But since both Dirichlet multiplication and factorability depend upon a 


Received December 27, 1938. 
$^1$ Numbers in brackets refer to the references at the end of the paper.










multiplicative property of the integers, our way is apparently blocked by the impossibility of introducing a multiplication into an arbitrary lattice.$^2$

This difficulty is surmountable by passing to Ore’s ‘‘quotient structures” 
(Ore [1]) and the analogous quotient sets of a partially ordered set. For these 
systems, a multiplication naturally presents itself (Ore [1], p. 426) which enables 
us to define a “Dirichlet product” of two partially ordered quotient sets, and 
thereby generalize properties I and II to semi-ordered sets. Our results include 
Weisner’s (Weisner [1]) remarkable Mébius function and his associated inversion 
formulas for lattices and semi-ordered sets. 

For factorable functions over lattices, another type of generalization is possible. For excluding the trivial function zero, we easily see from the fundamental theorem of arithmetic that $f(x)$ is factorable if and only if

(1.4) $f(m)f(n) = f([m, n])f((m, n))$,
(1.41) $f(1) = 1$.

Here $(m, n)$ and $[m, n]$ denote as usual the greatest common divisor and least common multiple of $m$ and $n$. Condition (1.4) may be immediately extended to lattices, as the implied multiplications occur in the range of the dependent variable.
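Conditions (1.4) and (1.41) are easy to test on a finite range. The sketch below checks them for every pair up to a chosen limit; the sum-of-divisors function serves as a factorable example, and the function names are illustrative.

```python
from math import gcd

def sigma(n):
    """Sum of the divisors of n, a standard factorable function."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def satisfies_1_4(f, limit):
    """Check f(1) = 1 and f(m) f(n) = f([m, n]) f((m, n)) for 1 <= m, n <= limit."""
    if f(1) != 1:
        return False
    return all(
        f(m) * f(n) == f(m * n // gcd(m, n)) * f(gcd(m, n))
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )
```

A function such as $f(1) = 1$, $f(n) = 2$ for $n > 1$ fails the test already at $m = 2$, $n = 3$, since $f(2)f(3) = 4$ while $f(6)f(1) = 2$.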

We are thus enabled to unify results of Dedekind [1], Birkhoff [1], [2], Glivenko [1], [2] on norms, ranks and distances defined over lattices. In particular, we show that Dedekind's module symbol $(a, b)$ (Dedekind [1], pp. 267-271; Dedekind [2]) and the distance function introduced by Glivenko are definable in terms of one another.

An arithmetical function $f(x)$ is said to be multiplicative if

(1.5) $f(mn) = f(m)f(n)$ for all integers $m$, $n$.

By extending this definition to the multiplication of quotient sets defined in §4, we show how we may pass from functions factorable over a set to functions factorable over the quotient set. But the closure property III of factorable functions with respect to Dirichlet multiplication is lost, since we prove that it implies that the lattice must be distributive. Our proof rests upon a useful result in pure lattice theory.


2. An element $a$ of a semi-ordered set or a lattice is said to cover another element $c$ of the set (Birkhoff [1]) if $a \ne c$ and $a \supset x \supset c$ implies $a = x$ or $c = x$. A subset of a semi-ordered set is said to be "complete" if for any two elements $a$ and $b$ of the subset, $a$ covers $b$ in the subset if and only if $a$ covers $b$ in the containing set.

$^2$ The properties of lattices over which a multiplication exists have been studied by Ward and R. P. Dilworth in some detail (Ward [1], [2]; Dilworth [1], [2], [3]; Ward-Dilworth [1], [2]). But for such simple lattices as the modular and non-modular lattices of order five and Dedekind's free modular lattice of order twenty-eight, no multiplication is definable.










3. We prove 


THEOREM 3.1. Let S be a lattice such that in every quotient lattice a/b in which 
a # b, there exists an element c covered bya. Then S is a modular non-distributive 
lattice if and only if S contains a complete modular sublattice of order five. 


The plan of the paper is sufficiently indicated by the chapter titles. We 
assume that the reader is familiar with the first part of Ore’s fundamental 
memoir On the foundation of abstract algebra (Ore [1]) and also with our previous 
paper in this Journal (Ward [1]) upon multiplication and residuation in structures. We use the notation and terminology of the latter paper with the substitution of the term "lattice" for the term "structure".

It is a pleasure to acknowledge my indebtedness to many stimulating dis- 
cussions with Professor E. T. Bell, who first called my attention to the im- 
portant work of Weisner [1], [2]. 


II. The ring of functions on semi-ordered sets


4. Let $S$ be a semi-ordered set of elements $a, b, \ldots$ with respect to a well-defined ordering relation $x \supset y$, and define equality in $S$ as usual by $x = y$ if and only if $x \supset y$ and $y \supset x$. Unequal elements will be called distinct. If $u \ne v$ and if $u \supset x \supset v$ implies $u = x$ or $v = x$, we say $u$ covers $v$, writing $u \succ v$, $v \prec u$. If $S$ is ordered, we call $S$ a chain.

Given any two elements $u$, $v$ of $S$ such that $u \supset v$, the class $\mathfrak{X}$ of all elements $x$ such that $u \supset x \supset v$ forms a semi-ordered set which we call the quotient of $v$ by $u$. We write $\mathfrak{X} = u/v$, the restriction $u \supset v$ being understood. We make the totality of all quotients $\mathfrak{X}$ of $S$ into a semi-ordered set $\Sigma$ by defining $\mathfrak{X} \supset \mathfrak{Y}$ as follows: If $\mathfrak{X} = u/v$ and $\mathfrak{Y} = z/w$, then $\mathfrak{X} \supset \mathfrak{Y}$ if and only if $u \supset z$ and $v \supset w$. With this ordering, two quotients $\mathfrak{X}$ and $\mathfrak{Y}$ are equal if and only if the classes $\mathfrak{X}$ and $\mathfrak{Y}$ are equal in the set-theoretic sense. The quotients $u/u$ form a partially ordered set isomorphic with $S$. We call any such quotient a unit.

Given two quotients $\mathfrak{X} = u/v$ and $\mathfrak{Y} = z/w$ such that $v = z$, we define their product to be the quotient $\mathfrak{P} = u/w$, writing

$$\mathfrak{P} = \mathfrak{X} \cdot \mathfrak{Y}, \quad \text{or} \quad u/w = u/v \cdot v/w.$$

If $v \ne z$, no product is defined. This multiplication is associative but non-commutative save in the trivial case $\mathfrak{X} = \mathfrak{Y} =$ a unit.

If $S$ is a lattice, our concepts reduce to the quotient structures introduced by Ore.


5. Now let $\Gamma$ be an arbitrarily chosen division algebra of characteristic zero, and consider the totality of all well-defined one-valued functions $\phi$ on $\Sigma$ to $\Gamma$. If $\mathfrak{X} = u/v$, we write

$$\phi\mathfrak{X} = \phi_{uv}$$

for the value of $\phi$ in $\Gamma$ which corresponds to $\mathfrak{X}$.










We shall assume from now on
P1. The number of distinct elements x in every quotient $\mathfrak{X}$ of $\Sigma$ is finite.

In particular then, a unit quotient contains precisely one distinct element.

Two set functions $\varphi$ and $\psi$ will be said to be equal if and only if their values
$\varphi_{uv}$ and $\psi_{uv}$ are equal in $\Gamma$ for all $\mathfrak{X} = u/v$ of $\Sigma$. We write as usual $\varphi = \psi$.

We introduce an addition and multiplication for set functions as follows.
The sum $\sigma = \varphi + \psi$ of two set functions $\varphi$ and $\psi$ is defined by

    $\sigma\mathfrak{X} = \varphi\mathfrak{X} + \psi\mathfrak{X}$,   $\sigma_{uv} = \varphi_{uv} + \psi_{uv}$.

The Dirichlet product $\tau = \varphi\psi$ of $\varphi$ and $\psi$ in that order is defined by

(5.1)    $\tau\mathfrak{X} = \sum_{\mathfrak{U} \cdot \mathfrak{V} = \mathfrak{X}} \varphi\mathfrak{U}\,\psi\mathfrak{V}$,   $\tau_{uv} = \sum_{u \supset x \supset v} \varphi_{ux}\psi_{xv}$.

Here the first summation is extended over all distinct pairs of quotients
$\mathfrak{U}$ and $\mathfrak{V}$ whose product is $\mathfrak{X}$, in strict analogy with the Dirichlet multiplication
(1.1). In the second summation, the sum is taken over all distinct elements
x of the quotient set u/v. On occasion, we write $\tau_{uv} = \varphi_{ux}\psi_{xv}$, the summation
over u/v being indicated by the repeated index x. P1 insures that $\tau$ is a set
function.
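
As an illustration of the product (5.1), the following is a minimal sketch in Python, not part of the original paper. It assumes S is the set of divisors of 12 and reads the ordering u ⊃ v as "u divides v"; the names `divisors`, `contains` and `dirichlet` are hypothetical helpers introduced only for this sketch.

```python
def divisors(n):
    """All positive divisors of n, used here as the finite set S (an assumption)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def contains(u, v):
    """The ordering u ⊃ v, read here as 'u divides v'."""
    return v % u == 0


def dirichlet(phi, psi, elements):
    """Return tau = phi psi, where tau(u, v) sums phi(u, x) * psi(x, v) over u ⊃ x ⊃ v."""
    def tau(u, v):
        if not contains(u, v):
            raise ValueError("u/v is a quotient only when u ⊃ v")
        return sum(phi(u, x) * psi(x, v)
                   for x in elements
                   if contains(u, x) and contains(x, v))
    return tau


S = divisors(12)
zeta = lambda u, v: 1                     # the function that is always 1
delta = lambda u, v: 1 if u == v else 0   # the unit of the ring
zeta2 = dirichlet(zeta, zeta, S)          # counts the elements of u/v

print(zeta2(1, 12))  # number of divisors of 12
```

With these choices the square of the constant function counts the elements of each quotient, and multiplying by `delta` on either side leaves a function unchanged.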
The following analogue to I of §1 follows immediately from the definitions.


THEOREM 5.1. The totality of all set functions $\varphi$ on $\Sigma$ to $\Gamma$ form a ring.


The ring has a unit element $\delta$ defined by

    $\delta_{uv} = 1$ if $u = v$;   $\delta_{uv} = 0$ otherwise,

with the characteristic property $\delta\psi = \psi\delta = \psi$. We denote this ring by $\mathfrak{R}$.


6. A function $\varphi$ of the ring $\mathfrak{R}$ is called proper if and only if it has an inverse
$\varphi^{-1}$ with respect to multiplication, so that

(6.1)    $\varphi\varphi^{-1} = \delta$   or   $\varphi_{ux}\varphi^{-1}_{xv} = \delta_{uv}$.

THEOREM 6.1. A function $\varphi$ is proper if and only if

(6.2)    $\varphi_{uu} \neq 0$ for every unit quotient u/u of $\Sigma$.


Proof. Condition (6.2) is necessary. For if, for some element a of S, $\varphi_{aa} = 0$,
(6.1) gives for $u = v = a$, $0 \cdot \varphi^{-1}_{aa} = 1$, and this is impossible in a division algebra.
Condition (6.2) is sufficient. For if it is satisfied, then if we put $u = v$ in (6.1),
$\varphi^{-1}_{uu} = 1/\varphi_{uu}$ for all u. Thus $\varphi^{-1}\mathfrak{X}$ is defined for all unit quotients u/u. Assume
that $\varphi^{-1}\mathfrak{X}$ is known for all quotients containing fewer than k distinct elements
($k \geq 2$), and let $\mathfrak{A} = a/b$ be any quotient containing exactly k distinct elements.
Then putting $u = a$, $v = b$ in (6.1) and (6.2), we find that

    $\varphi^{-1}\mathfrak{A} = -\frac{1}{\varphi_{aa}} {\sum}'_{a \supset x \supset b} \varphi_{ax}\varphi^{-1}_{xb}$,

the prime indicating that the term with $x = a$ is to be omitted from
the sum. Each quotient x/b contains at most $k - 1$ distinct elements. Hence
all the values of $\varphi^{-1}$ in the summation are known so that $\varphi^{-1}\mathfrak{A}$ is determined.
Hence by induction, $\varphi^{-1}\mathfrak{X}$ is known for every $\mathfrak{X}$ of $\Sigma$.
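
The induction in this proof is in effect a recursive procedure for computing an inverse. The sketch below, which is not from the paper, follows that recursion under the assumption that S is the set of divisors of 12 ordered by "u divides v"; `inverse`, `divisors` and `contains` are hypothetical names chosen for the illustration.

```python
from fractions import Fraction
from functools import lru_cache


def divisors(n):
    """All positive divisors of n, used here as the finite set S (an assumption)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def contains(u, v):
    """The ordering u ⊃ v, read here as 'u divides v'."""
    return v % u == 0


def inverse(phi, elements):
    """Inverse of a proper function phi, built by the recursion in the proof of Theorem 6.1."""
    @lru_cache(maxsize=None)
    def inv(u, v):
        if u == v:
            # Unit quotient: the inverse value is 1 / phi(u, u).
            return Fraction(1) / phi(u, u)
        # Primed sum: every x of u/v except x = u.
        total = sum((phi(u, x) * inv(x, v)
                     for x in elements
                     if x != u and contains(u, x) and contains(x, v)),
                    Fraction(0))
        return -total / phi(u, u)
    return inv


S = divisors(12)
zeta = lambda u, v: 1
mu = inverse(zeta, S)   # the inverse of the constant function 1

print([int(mu(1, n)) for n in S])
```

Each call only needs values on quotients x/v with fewer elements than u/v, which is exactly why the induction terminates under P1.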



















Since $\delta$ is proper and $(\varphi\psi)_{uu} = \varphi_{uu}\psi_{uu}$, while $\Gamma$ contains no divisors of zero,
we have immediately the following analogue of II in §1.


THEOREM 6.2. The set of all proper functions on $\Sigma$ to $\Gamma$ forms a group with
respect to Dirichlet multiplication.


7. As illustrations of set functions, consider the following list of functions 
for any semi-ordered set satisfying condition P1. 


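TABLE OF SPECIAL FUNCTIONS


NAME OR SYMBOL                           VALUE OR PROPERTY
(i)    zero, $\omega$                    $\omega_{uv} = 0$
(ii)   one, $\delta$                     $\delta_{uv} = 0$ if $u \neq v$; $\delta_{uv} = 1$ if $u = v$
(iii)  $\zeta$                           $\zeta_{uv} = 1$
(iv)   $\theta = \zeta - \delta$         $\theta_{uv} = 0$ if $u = v$; $\theta_{uv} = 1$ if $u \neq v$
(v)    Möbius function $\mu = \zeta^{-1}$    $\mu_{ux}\zeta_{xv} = \zeta_{ux}\mu_{xv} = \delta_{uv}$
(vi)   $\zeta^2$                         $\zeta^2_{uv}$ is the number of distinct elements
                                         in the quotient u/v
(vii)  covering function $\kappa$        $\kappa_{uv} = 1$ if $u = v$ or $u \succ v$; $\kappa_{uv} = 0$ otherwise
(viii) $\lambda = (\kappa - \delta)\zeta$    $\lambda_{uv}$ is the number of distinct elements
(ix)   $\nu = \zeta(\kappa - \delta)$        in u/v covered by u, and $\nu_{uv}$ is the
                                         number covering v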


The functions $\delta$, $\zeta$, $\mu$ and $\kappa$ are all proper. It is easily shown that if S is a
chain, then


uw = (-1)*, A=v=6= 3(u — x). 


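The characteristic property of the Möbius function is expressed by the re-
lations

    $\varphi = \zeta\psi$ ($\psi\zeta$) implies $\psi = \mu\varphi$ ($\varphi\mu$)

or more completely (Weisner [1]) as follows:

    If $\varphi_{uv} = \sum_{u \supset x \supset v} \psi_{xv}$, then $\psi_{uv} = \sum_{u \supset x \supset v} \mu_{ux}\varphi_{xv}$.

    If $\varphi_{uv} = \sum_{u \supset x \supset v} \psi_{ux}$, then $\psi_{uv} = \sum_{u \supset x \supset v} \varphi_{ux}\mu_{xv}$.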













If we take for S the integers 1, 2, 3, ..., for $x \supset y$ the division relation x
divides y, and for $\Gamma$ the field of complex numbers, and if for any quotient $\mathfrak{X}$ =
u/v we always choose $u = 1$, then on writing n for $\mathfrak{X}$, $\mu(n)$ is the Möbius function,
$\zeta^2(n)$ is the number of divisors of n, and $\lambda(n)$ is the number of distinct prime
factors of n. Another numerical function of importance is the total number of
prime factors of n, $\rho(n)$. For S a modular lattice with a unit, the generalized
function $\rho_{uv}$ is the length of any principal chain joining u and v.
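
These arithmetical readings can be checked on a small case. The sketch below is an illustration only, not taken from the paper: it assumes S is the set of divisors of 60 ordered by divisibility, and it builds the two covering counts as the Dirichlet products of the covering indicator with the constant function 1, in the two possible orders. All function names are hypothetical.

```python
def divisors(n):
    """All positive divisors of n, used here as the finite set S (an assumption)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def contains(u, v):
    """The ordering u ⊃ v, read here as 'u divides v'."""
    return v % u == 0


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def covers(u, v):
    """u covers v in the divisibility ordering: v is a prime multiple of u."""
    return contains(u, v) and is_prime(v // u)


def dirichlet(phi, psi, elements):
    """Dirichlet product: sum of phi(u, x) * psi(x, v) over u ⊃ x ⊃ v."""
    def tau(u, v):
        return sum(phi(u, x) * psi(x, v)
                   for x in elements
                   if contains(u, x) and contains(x, v))
    return tau


S = divisors(60)
zeta = lambda u, v: 1
cover_only = lambda u, v: 1 if covers(u, v) else 0   # kappa - delta

zeta2 = dirichlet(zeta, zeta, S)        # number of elements of u/v
lam = dirichlet(cover_only, zeta, S)    # elements of u/v covered by u
nu = dirichlet(zeta, cover_only, S)     # elements of u/v covering v

print(zeta2(1, 60), lam(1, 60), nu(1, 60))
```

For the quotient 1/60 the first value is the number of divisors of 60 and the second is the number of distinct primes dividing 60.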


8. If S contains only a finite number of distinct elements $u_1, u_2, \dots, u_k$,
the ring $\mathfrak{R}$ may be represented as a matric algebra of order k over $\Gamma$. For con-
sider the totality of k-rowed square matrices $\Phi = (\varphi_{ij})$ over $\Gamma$ with the property
that $\varphi_{ij} = 0$ if $u_i \not\supset u_j$. We can correlate each such $\Phi$ with the function $\varphi$ whose
values are $\varphi_{u_i u_j} = \varphi_{ij}$. We write $\Phi \to \varphi$.

If $\Phi \to \varphi$ and $\Psi \to \psi$, then clearly $\Phi = \Psi$ if and only if $\varphi = \psi$, and
$\Phi + \Psi \to \varphi + \psi$. The correspondence also preserves multiplication, for

    $(\Phi\Psi)_{ij} = \sum_{x=1}^{k} \varphi_{ix}\psi_{xj}$.

Every term in this sum in which we do not have both $u_i \supset u_x$ and $u_x \supset u_j$
vanishes. Hence by (5.1),

    $(\Phi\Psi)_{ij} = (\varphi\psi)_{u_i u_j}$ if $u_i \supset u_j$,   $(\Phi\Psi)_{ij} = 0$ otherwise.

This correspondence extends to the group of proper functions, to each of
which corresponds a non-singular matrix. In particular, since $\delta$ corresponds
to the unit matrix $(\delta_{ij})$, we have a method for calculating the inverse of any
proper function $\varphi$ by calculating the reciprocal of its matrix $\Phi$.

Consider, for example, the Möbius function over the quotient structure of
the modular lattice of order five. If we write for simplicity i for $u_i$ ($i = 1, \dots, 5$)
and designate the elements as in Figure 1, the matrices Z and $M = Z^{-1}$ cor-
responding to the functions $\zeta$ and $\mu = \zeta^{-1}$ are

[Fig. 1: the modular lattice of order five, with elements numbered 1 to 5.]

    $Z = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$,   $M = \begin{pmatrix} 1 & -1 & -1 & -1 & 2 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$.

Thus $\mu_{11} = 1$, $\mu_{12} = -1$, ..., $\mu_{15} = 2$, and so on.*
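
The matrix method can be tried directly on this example. The sketch below is not from the paper; it assumes the five elements are numbered so that 1 contains every element, 5 is contained in every element, and 2, 3, 4 are mutually incomparable, and it inverts the matrix of the constant function 1 by exact Gauss-Jordan elimination. The helper name `invert` is hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Exact inverse of a square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def contains(i, j):
    """Assumed ordering of the modular lattice of order five."""
    return i == j or i == 1 or j == 5


# Matrix of the constant function 1: entry 1 when u_i ⊃ u_j, else 0.
Z = [[1 if contains(i, j) else 0 for j in range(1, 6)] for i in range(1, 6)]
M = [[int(x) for x in row] for row in invert(Z)]

for row in M:
    print(row)
```

The top right entry of the printed matrix is the value of the Möbius function on the whole lattice.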


III. Factorable functions and norms 


9. From now on we assume that the semi-ordered set S is a lattice and con-
fine ourselves at first to functions on S to an Abelian group A. For the time
being we do not assume postulate P1.

A function $\varphi$ on S to A is said to be factorable if


N1. $a = b$ in S implies $\varphi a = \varphi b$ in A;
N2. $\varphi a\,\varphi b = \varphi(a, b)\,\varphi[a, b]$ for all pairs of elements a, b of S.†


If we assume that a commutative multiplication xy is definable over the
lattice with the properties given in Ward [1], and also assume that

(9.1)    $\varphi e = 1$, e the unit of S, 1 the identity element of A,

then since $[a, b] = ab$ if $(a, b) = e$ (Ward [1]), we have

(9.2)    $\varphi ab = \varphi a\,\varphi b$ if $(a, b) = e$.

It therefore seems appropriate to call all functions $\varphi$ satisfying N1, N2 "fac-
torable" whether or not a multiplication is definable over S.
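
To make N2 concrete, here is a small sketch that is not part of the paper. It assumes the lattice of positive integers under divisibility, with (a, b) the greatest common divisor and [a, b] the least common multiple, and tests N2 by brute force for a few integer-valued functions; `is_factorable` and `totient` are hypothetical helper names.

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def totient(n):
    """Euler's function, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def is_factorable(phi, limit):
    """Check N2, phi(a) phi(b) = phi((a, b)) phi([a, b]), for all a, b up to limit."""
    return all(phi(a) * phi(b) == phi(gcd(a, b)) * phi(lcm(a, b))
               for a in range(1, limit + 1)
               for b in range(1, limit + 1))


print(is_factorable(lambda n: n, 20))      # the identity function
print(is_factorable(totient, 20))          # a multiplicative arithmetic function
print(is_factorable(lambda n: n + 1, 20))  # fails, e.g. for a = 2, b = 3
```

The first two functions also satisfy (9.2) in the usual arithmetical sense, which is the motivation the text gives for the name "factorable".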


10. Factorable functions are of frequent occurrence in lattice theory. For
example, the following functions are always factorable.

(i) The function $\zeta$ defined by $\zeta a = 1$ for all a of S.

(ii) The rank function of Dedekind [1], Birkhoff [1].

(iii) The dimension function of von Neumann [1].

(iv) The norm function of Glivenko.

The last three functions may only be introduced in a modular lattice.

(v) Any evaluation in a residuated lattice is factorable (Ward-Dilworth [3]).

(vi) Let S be an Archimedean residuated lattice of order $\geq 2$. Then S
contains divisor-free elements, and each element a of S has only a finite number
$\lambda a$ of divisor-free divisors. $\lambda a$ is (additively) factorable over S.


* By applying the ideas developed in Bell's Algebraic Arithmetic, a similar representation
by infinite matrices may be given for any denumerable semi-ordered set as no questions of
convergence are involved.

† The group A may be written additively if preferred.










(vii) The ordinary product $\varphi\psi$ of two factorable functions defined by
$(\varphi\psi)x = \varphi x\,\psi x$ is factorable. (The Dirichlet product of factorable functions over a
quotient lattice need not be factorable.)

(viii) If S' is a sublattice of S, every function factorable over S is factorable
over S'.

(ix) If S is a chain, every function satisfying N1 also satisfies N2 and is hence
factorable.

(x) Let S be a residuated lattice which is the direct product (Ward-Dilworth
[2]) of lattices $S_\alpha$. Then a function $\varphi$ factorable over S defines functions $\varphi_\alpha$
factorable over $S_\alpha$. Conversely, the ordinary product of functions $\varphi_\alpha$ fac-
torable over $S_\alpha$ gives a function factorable over S. The instance of common
arithmetic occurs when S is the direct product of chains. It is essential that
S be residuated.


11. We have the following fundamental lemma:

LEMMA 11.1. If $\varphi$ is factorable over S and $a \supset b$, then for any c of S

    $\varphi[a, (b, c)] = \varphi(b, [a, c])$.


For if $a \supset b$, N2, N1 give
$\varphi a\,\varphi b\,\varphi c = \varphi a\,\varphi(b, c)\,\varphi[b, c] = \varphi(a, (b, c))\,\varphi[a, (b, c)]\,\varphi[b, c] = \varphi(a, c)\,\varphi[b, c]\,\varphi[a, (b, c)]$;
$\varphi a\,\varphi b\,\varphi c = \varphi(a, c)\,\varphi[a, c]\,\varphi b = \varphi(a, c)\,\varphi([a, c], b)\,\varphi[[a, c], b] = \varphi(a, c)\,\varphi[b, c]\,\varphi(b, [a, c])$.
Since A is a group, the result follows.


DEFINITION OF A NORM. A factorable function $\varphi x$ on S to an Abelian group A
is said to be a norm if and only if


N3. $a \supset b$ in S and $\varphi a = \varphi b$ in A imply $a = b$ in S.

We denote a norm function by Nx. Lemma 11.1 gives immediately

THEOREM 11.1. If a norm Nx is definable over S, then S is a modular lattice.


For $a \supset b$ implies $[a, (b, c)] \supset (b, [a, c])$.
Since we have trivially $[(a, b), (a, c)] \supset (a, [b, c])$ and $[a, (b, c)] \supset ([a, b], [a, c])$,
N3 gives


THEOREM 11.2. If a norm Nx is definable over S, then S is distributive if and
only if

(11.1)    $N(a, [b, c]) = N[(a, b), (a, c)]$ and $N[a, (b, c)] = N([a, b], [a, c])$

for every set of three elements a, b, c of S.

The following theorem is also a consequence of N3.


THEOREM 11.3. If Nx is a norm over S, then $a \supset b$ if and only if $Na = N(a, b)$
and $Nb = N[a, b]$.










IV. Modular functions and distance functions 


12. Let Nx be a norm on S satisfying conditions N1, N2 and N3. The func-
tions Mxy and Dxy defined by

(12.1)    $Mxy = \frac{Ny}{N(x, y)}$,   $Dxy = \frac{N[x, y]}{N(x, y)}$

are called the modular function and distance function associated with the norm
Nx. They are connected by the formulas

(12.2)    $Mxy = D(x, y)y$;   $Dxy = Mxy\,Myx$.

The modular function and the distance function have the following properties:


M1. $a = b$ implies $Mac = Mbc$ and $Mca = Mcb$, any c.
M2. $a \supset b$ if and only if $Mba = 1$.
M3. $M(a, b)b = Mab$.
M4. $Ma[a, b] = Mab$.
M5. If $a \supset b \supset c$, then $Mac = Mab\,Mbc$.
D1. $a = b$ implies $Dac = Dbc$ and $Dca = Dcb$, any c.
D2. $a = b$ if and only if $Dab = 1$.
D3. $Dab = Dba$.
D4. $D(a, b)b = Da[a, b]$.
D5. If $a \supset b \supset c$, then $Dac = Dab\,Dbc$.
These properties are all simple consequences of the properties of a norm. 
For example, consider M5 and D4. Since $a \supset c$, $(a, c) = a$. Hence $N(a, c) = Na$
by N1. Therefore by (12.1)

    $Mac = \frac{Nc}{Na} = \frac{Nb\,Nc}{Na\,Nb}$.

But since $a \supset b \supset c$, $a = (a, b)$ and $b = (b, c)$. Hence by N1, $Na = N(a, b)$ and
$Nb = N(b, c)$. Therefore by (12.1),

    $Mac = \frac{Nb}{N(a, b)} \cdot \frac{Nc}{N(b, c)} = Mab\,Mbc$,

and this is M5. For D4, we have $[(a, b), b] = b$, $((a, b), b) = (a, b)$. Hence by
(12.1) and N1,

    $D(a, b)b = \frac{Nb}{N(a, b)} = \frac{N[a, b]}{Na}$   by N2.

But $[[a, b], a] = [a, b]$ and $([a, b], a) = a$. Hence by N1 and (12.1),
$Da[a, b] = \frac{N[a, b]}{Na}$, or $D(a, b)b = Da[a, b]$.

Let S be a lattice with a unit element e. Then if a modular function Mxy
is defined on S to A with Properties M1-M5, it is easy to show that

(12.3)    $Nx = Mex$


* Properties M2-M5 are given in Dedekind [1]. Properties D2 and D3 are a general- 
ization of the first two distance axioms of Fréchet and Hausdorff. 








is a norm. Similarly, if a distance function Dxy is defined on S to A with
Properties D1-D5, then

(12.4)    $Nx = Dex$

is a norm. Hence we have


THEOREM 12.1. In a lattice with a unit element, a norm, a distance function,
and a modular function are equivalent concepts, each definable in terms of the other.


The following theorem is also immediate.


THEOREM 12.2. Let S be a lattice with a norm Nx and associated modular
functions and distance functions Mxy and Dxy. Then $a \supset b$ if and only if
$Mab = Dab$.
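
The properties of this section can be verified numerically in the most familiar case. The following sketch is an illustration rather than anything from the paper: it assumes S is the positive integers under divisibility, with (a, b) the greatest common divisor, [a, b] the least common multiple, and the norm Nx = x taking values in the multiplicative group of positive rationals.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def contains(a, b):
    """a ⊃ b, read here as 'a divides b'."""
    return b % a == 0


def N(x):
    """Assumed norm: the integer itself, as a positive rational."""
    return Fraction(x)


def M(x, y):
    """Modular function of (12.1): Ny / N(x, y)."""
    return N(y) / N(gcd(x, y))


def D(x, y):
    """Distance function of (12.1): N[x, y] / N(x, y)."""
    return N(lcm(x, y)) / N(gcd(x, y))


R = range(1, 16)
pairs = [(a, b) for a in R for b in R]

relations = all(M(a, b) == D(gcd(a, b), b) and D(a, b) == M(a, b) * M(b, a)
                for a, b in pairs)                                   # (12.2)
m2 = all((M(b, a) == 1) == contains(a, b) for a, b in pairs)
m3 = all(M(gcd(a, b), b) == M(a, b) for a, b in pairs)
m4 = all(M(a, lcm(a, b)) == M(a, b) for a, b in pairs)
m5 = all(M(a, c) == M(a, b) * M(b, c)
         for a, b in pairs for c in R
         if contains(a, b) and contains(b, c))
d4 = all(D(gcd(a, b), b) == D(a, lcm(a, b)) for a, b in pairs)
thm_12_2 = all((M(a, b) == D(a, b)) == contains(a, b) for a, b in pairs)

print(relations, m2, m3, m4, m5, d4, thm_12_2)
```

A brute-force check over a small range is of course no proof, but it confirms that the formulas are being read with the intended order of arguments.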


13. In many instances, the group A consists of the integers or the real numbers
under addition, and the value of the norm is positive or zero. This occurs,
for example, in the instances (ii)-(vi) cited in §10. We are thus led to the
following additional restrictions on a norm.

A set $\mathcal{A}$ of elements of the group A is said to be an "integral set" if (i) it is
closed under group multiplication; (ii) it contains the group identity 1; (iii)
for at least one element a of $\mathcal{A}$, $a^{-1}$ is not in $\mathcal{A}$. We may partially order $\mathcal{A}$ and
hence A by the division relation $x \mid y$ where $x \mid y$ if and only if $yx^{-1}$ lies in $\mathcal{A}$.
(For the case when $\mathcal{A}$ is the set of positive integers and the group operation is
addition, $a \mid b$ if and only if $a \leq b$.)

A norm Nx on S to A will be said to be integral if and only if


N4. All the values of Nx lie in a set $\mathcal{A}$ of integral elements of A.
N5. $a \supset b$ in S implies $Na \mid Nb$ in $\mathcal{A}$.


There exist factorable functions satisfying N1, N2, N4 and N5, but not N3;
the simplest example is an evaluation (Ward-Dilworth [3]).

Formulas (12.1) show us that we then have the following conditions on the
associated modular function and distance function:

M6-D6. All the values of Mxy (Dxy) lie in the integral set.

But conversely the norm associated with a modular function (distance func-
tion) with Properties M1-M6 (D1-D6) will have Properties N4 and N5. The
truth of N4 is obvious from (12.3), (12.4). Consider N5. If $a \supset b$ in S, then
$Mba = 1$. Hence since $e \supset a \supset b$, $Meb = Mea\,Mab$ or $Nb = Na\,Mab$. Since
Mab lies in the integral set, $Na \mid Nb$. The proof for the distance function is similar.
Let us call a modular function (distance function) satisfying M1-M6 (D1-D6)
"integral". Then Theorem 12.1 becomes


THEOREM 13.1. In a lattice with a unit element, an integral norm, integral
distance function and integral modular function are equivalent concepts, each
definable in terms of the other.


We readily find that for any a, b and c

    $Mab\,Mbc = Mac \cdot \frac{N[b, (a, c)]}{N[(a, b), (b, c)]}$,

    $Dab\,Dbc = Dac \cdot \frac{N[(a, c), b]\,N([a, b], [b, c])}{N(b, [a, c])\,N[(a, b), (b, c)]}$.

Since $[(a, b), (b, c)] \supset [b, (a, c)]$ and $(b, [a, c]) \supset [(a, c), b]$,
$[(a, b), (b, c)] \supset ([a, b], [b, c])$, it follows from N5 that


M7. $Mac \mid Mab\,Mbc$,
D7. $Dac \mid Dab\,Dbc$.


D7 reduces to the familiar triangle inequality for the distance function when
the integral set is the set of real numbers $\geq 0$ and the group operation is addition.
M7 is an analogous "triangle inequality" for the modular function.
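
A quick numerical check of M7 and D7 is sketched below; it is not from the paper. It assumes the positive integers under divisibility with Nx = x, and it takes the positive integers under multiplication as the integral set inside the group of positive rationals, so that x | y means that y/x is an integer. The helper `divides` is a hypothetical name.

```python
from fractions import Fraction
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def M(x, y):
    """Modular function for the assumed norm Nx = x."""
    return Fraction(y, gcd(x, y))


def D(x, y):
    """Distance function for the assumed norm Nx = x."""
    return Fraction(lcm(x, y), gcd(x, y))


def divides(x, y):
    """x | y in the integral set of positive integers: y / x is an integer."""
    return (y / x).denominator == 1


R = range(1, 13)
triples = [(a, b, c) for a in R for b in R for c in R]

m7 = all(divides(M(a, c), M(a, b) * M(b, c)) for a, b, c in triples)
d7 = all(divides(D(a, c), D(a, b) * D(b, c)) for a, b, c in triples)

# The explicit quotient for the modular function, read off the norm identity.
m_formula = all(
    M(a, b) * M(b, c)
    == M(a, c) * Fraction(lcm(b, gcd(a, c)), lcm(gcd(a, b), gcd(b, c)))
    for a, b, c in triples)

print(m7, d7, m_formula)
```

Taking logarithms of D turns the divisibility statement D7 into an ordinary triangle inequality for a real-valued distance.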


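V. Factorable functions and multiplicative functions over quotient lattices


14. Let $\Sigma$ be the quotient lattice of S. A function $\varphi\mathfrak{X}$ on $\Sigma$ to A is said to be
"factorable" if it satisfies N1 and N2, and "multiplicative" if

(14.1)    $\varphi(\mathfrak{X} \cdot \mathfrak{Y}) = \varphi\mathfrak{X}\,\varphi\mathfrak{Y}$.

Here, as in §4, the product $\mathfrak{X} \cdot \mathfrak{Y}$ of the quotient lattices $\mathfrak{X} = u/v$ and $\mathfrak{Y} = v/w$
is the lattice u/w.

A factorable function $\varphi\mathfrak{X}$ on $\Sigma$ to A defines a factorable function on S to A;
for we may define $\varphi x$, x in S, to mean $\varphi_{xx}$, x/x a unit quotient of $\Sigma$. We may
also pass from a factorable function $\varphi$ on S to a factorable function $\varphi$ on $\Sigma$.
For given any quotient $\mathfrak{A} = a/b$, we define $\varphi\mathfrak{A}$ to mean $\frac{\varphi b}{\varphi a}$. $\varphi\mathfrak{X}$ is obviously
factorable. For if $\mathfrak{A} = a/b$, $\mathfrak{B} = c/d$, $(\mathfrak{A}, \mathfrak{B})$ and $[\mathfrak{A}, \mathfrak{B}]$ are defined as
$(a, c)/(b, d)$ and $[a, c]/[b, d]$ (Ore [1]). We call $\varphi\mathfrak{X}$ the "extension" of $\varphi$ over $\Sigma$.*
$\varphi\mathfrak{X}$ is multiplicative. For if $\mathfrak{A} \cdot \mathfrak{B} = \mathfrak{C}$, where $\mathfrak{A} = a_1/a_2$, $\mathfrak{B} = b_1/b_2$,
$\mathfrak{C} = c_1/c_2$, then since $a_1 = c_1$, $a_2 = b_1$, $b_2 = c_2$, we have by N1

    $\varphi\mathfrak{C} = \frac{\varphi c_2}{\varphi c_1} = \frac{\varphi b_2}{\varphi a_1} = \frac{\varphi b_2}{\varphi b_1} \cdot \frac{\varphi a_2}{\varphi a_1} = \varphi\mathfrak{A}\,\varphi\mathfrak{B}$.

Conversely, if S contains a unit element e and $\varphi\mathfrak{X}$ is both factorable and
multiplicative over $\Sigma$, $\varphi$ is the extension of a factorable function on S. For
given any x in S, we define $\varphi x$ to be $\varphi(e/x) = \varphi_{ex}$. Then $\varphi$ is evidently factorable
over S. Since $\varphi$ is multiplicative over $\Sigma$, $\varphi_{ev} = \varphi_{eu}\varphi_{uv}$. Hence if $\mathfrak{X} = u/v$,

    $\varphi\mathfrak{X} = \frac{\varphi v}{\varphi u}$.


* A simple example is given by the rank function $\rho$ of a modular lattice of finite length
(Birkhoff [1]). The extension $\rho\mathfrak{X}$, $\mathfrak{X} = u/v$ is then the length of any principal chain joining
u and v.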










THEOREM 14.1. Let S be a lattice with a unit element. Then a factorable
function $\varphi\mathfrak{X}$ on the quotient lattice $\Sigma$ of S to an Abelian group A is multiplicative
over $\Sigma$ if and only if it is the extension of a factorable function on S to A.


The extension of a norm need not be a norm. For let Nx be a norm on S,
and $N\mathfrak{X}$ its extension over $\Sigma$. Let b and c be any two elements of S such that
$b \not\supset c$. Then $(b, c) \neq b$, $[b, c] \neq c$. However, $\frac{Nb}{N(b, c)} = \frac{N[b, c]}{Nc}$ by N2. Thus
for the two quotients $\mathfrak{A} = (b, c)/b$, $\mathfrak{B} = c/[b, c]$, we have $\mathfrak{A} \supset \mathfrak{B}$, $N\mathfrak{A} = N\mathfrak{B}$,
but $\mathfrak{A} \neq \mathfrak{B}$ in contradiction to N3.


VI. Factorable functions and Dirichlet multiplication 


15. Let $\psi\mathfrak{X}$ denote from now on a function on the quotient lattice $\Sigma$ of a lattice
S satisfying P1 of §5 to the division algebra $\Gamma$, and consider the totality of such
functions which satisfy the postulates N1 and N2 for factorable functions.


THEOREM 15.1. If the set of factorable functions $\psi$ on $\Sigma$ to $\Gamma$ is closed under
Dirichlet multiplication, the lattice $\Sigma$ must be distributive.


Proof. We observe first that it suffices to show that the basic lattice is dis-
tributive (Ore [1]). Now the function $\zeta$ which always equals 1 is obviously
factorable. Assume that $\zeta^2$ is also factorable. If $\mathfrak{X} = u/v$, $\zeta^2\mathfrak{X}$ is the number of
distinct elements in the lattice $\mathfrak{X}$. By hypothesis then

(15.1)    $\zeta^2\mathfrak{A}\,\zeta^2\mathfrak{B} = \zeta^2(\mathfrak{A}, \mathfrak{B})\,\zeta^2[\mathfrak{A}, \mathfrak{B}]$

for every pair of quotient lattices $\mathfrak{A}$, $\mathfrak{B}$ of $\Sigma$.

If S is non-modular, S contains a non-modular sublattice of order five. On
lettering its elements as in Figure 2, we have $a = (b, c) = (b, d)$ and $e = [b, c]
= [b, d]$.

[Fig. 2: the non-modular lattice of order five, with elements lettered a, b, c, d, e.]

Consider the three quotient lattices a/b, a/c and a/d. We have then $(a/b,
a/c) = (a/b, a/d) = a/a$ and $[a/b, a/c] = [a/b, a/d] = a/e$. We deduce then
from (15.1) that

    $\zeta^2_{ab}\zeta^2_{ac} = \zeta^2_{aa}\zeta^2_{ae}$,   $\zeta^2_{ab}\zeta^2_{ad} = \zeta^2_{aa}\zeta^2_{ae}$.

Hence $\zeta^2_{ac} = \zeta^2_{ad}$, and this is impossible. Hence S is modular.
If S is modular but not distributive, then by Theorem 3.1 (see §16), S con-
tains a complete modular sublattice of order five. Letter its elements as in
Figure 1, with 12345 replaced by abcde, respectively. Then we deduce that

    $\zeta^2_{ab}\zeta^2_{ac} = \zeta^2_{aa}\zeta^2_{ae}$.

But since the sublattice is complete, $\zeta^2_{ab} = \zeta^2_{ac} = 2$, $\zeta^2_{aa} = 1$, $\zeta^2_{ae} \geq 5$. Hence
S must be distributive.
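
The two counting arguments in this proof can be reproduced mechanically. The sketch below is an illustration and not part of the paper: it assumes each five-element lattice is given by its covering pairs (upper element first), takes (x, y) as the least element containing both and [x, y] as the greatest element contained in both, and compares the two sides of (15.1). The class name `Lattice` and its methods are hypothetical.

```python
def closure(elements, covers):
    """Reflexive and transitive closure of the covering pairs (upper, lower)."""
    geq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(geq):
            for (c, d) in list(geq):
                if b == c and (a, d) not in geq:
                    geq.add((a, d))
                    changed = True
    return geq


class Lattice:
    def __init__(self, elements, covers):
        self.elements = list(elements)
        self.geq = closure(self.elements, covers)

    def contains(self, x, y):
        """x ⊃ y."""
        return (x, y) in self.geq

    def join(self, a, b):
        """(a, b): the least element containing both a and b."""
        ups = [x for x in self.elements if self.contains(x, a) and self.contains(x, b)]
        return next(x for x in ups if all(self.contains(y, x) for y in ups))

    def meet(self, a, b):
        """[a, b]: the greatest element contained in both a and b."""
        downs = [x for x in self.elements if self.contains(a, x) and self.contains(b, x)]
        return next(x for x in downs if all(self.contains(x, y) for y in downs))

    def size(self, u, v):
        """Number of elements of the quotient u/v."""
        return sum(1 for x in self.elements if self.contains(u, x) and self.contains(x, v))

    def both_sides(self, A, B):
        """Left and right members of (15.1) for quotients A = u/v, B = z/w."""
        (u, v), (z, w) = A, B
        left = self.size(u, v) * self.size(z, w)
        right = (self.size(self.join(u, z), self.join(v, w))
                 * self.size(self.meet(u, z), self.meet(v, w)))
        return left, right


# Non-modular lattice of order five: a on top, e at the bottom, c above d.
N5 = Lattice("abcde", [("a", "b"), ("a", "c"), ("c", "d"), ("b", "e"), ("d", "e")])
# Modular lattice of order five: b, c, d mutually incomparable.
M3 = Lattice("abcde", [("a", "b"), ("a", "c"), ("a", "d"),
                       ("b", "e"), ("c", "e"), ("d", "e")])

print(N5.both_sides(("a", "b"), ("a", "c")))
print(N5.both_sides(("a", "b"), ("a", "d")))
print(M3.both_sides(("a", "b"), ("a", "c")))
```

In each printed pair the two members differ, so the element-counting function fails (15.1) on both five-element lattices, in line with the argument of the proof.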


16. It remains to prove Theorem 3.1. We may remark that the point of this 
theorem is that the sublattice is complete; Birkhoff has proved that in any finite 
modular non-distributive lattice there exists a modular sublattice of order five 
(Birkhoff [2]). The covering hypothesis of the theorem will be satisfied if the 
weak ascending chain axiom (Ore [1], p. 410) holds in S. The theorem may be 
obviously dualized, but it is easily shown by simple examples that no analogous 
result is true for the non-modular lattices of order five contained in a non- 
modular lattice (Dedekind [1]). 

Proof of Theorem 3.1. We first show that S contains at least one modular
sublattice of order five. This result, which is purely combinatorial,* rests upon
the following lemma of Dedekind's (Dedekind [1], p. 252).


LEMMA 1. S is distributive if and only if for any three elements a, b and c of S

(16.1)    $[(a, b), (b, c), (c, a)] = ([a, b], [b, c], [c, a])$.


Now, assume that S is modular but not distributive. Then S must contain
three elements a, b and c such that

(16.2)    $[(a, b), (b, c), (c, a)] \neq ([a, b], [b, c], [c, a])$.

Let $u = [(a, [b, c]), (b, c)]$, $v = [(b, [c, a]), (c, a)]$, $w = [(c, [a, b]), (a, b)]$. Then

(16.3)    $(u, v) = (v, w) = (w, u) = [(a, b), (b, c), (c, a)]$,
          $[u, v] = [v, w] = [w, u] = ([a, b], [b, c], [c, a])$.


For consider $[u, v]$. Since $(b, c) \supset (b, [c, a])$ and $(c, a) \supset (a, [b, c])$,
$[u, v] = [(a, [b, c]), (b, [c, a])]$. But $(a, [b, c]) \supset [a, c]$ and $b \supset [b, c]$. Hence by the
modular axiom, $[u, v] = ([c, a], [b, (a, [b, c])]) = ([c, a], ([b, c], [a, b])) = ([a, b],
[b, c], [c, a])$. The remaining equalities in (16.3) follow by symmetry and duality.

No one of the elements u, v and w can divide any other. If for example $u \supset v$,
then by the modular axiom $[u, (v, w)] = (v, [u, w])$. Hence by (16.3) $[u, (u, w)]
= (v, [v, w])$ or $u = v$. Then by (16.3) $(v, w) = (v, u) = v$, so that $v \supset w$. Hence
$u = v = w$, so that (16.3) implies (16.1), contradicting (16.2). The five ele-
ments $\{j = (u, v), u, v, w, t = [u, v]\}$ thus form a modular sublattice of S.

For the second part of the proof, we need the following lemma (Ore [1], p. 419;
Birkhoff [1]).


LEMMA 2. If a and b are any two elements of a modular lattice S, then $(a, b)$
covers a if and only if b covers $[a, b]$.


* The proof of Birkhoff [2] is indirect and rests upon properties of the rank function
applicable only if the lattice is of finite length.










With the notation previously employed, the quotient lattices j/u, j/v, j/w
and u/t, v/t, w/t are all isomorphic to one another. Hence if $u \succ t$, the lattice
$\{j, u, v, w, t\}$ is complete. If u does not cover t, then by the hypothesis of the
theorem, there exists an element s such that $u \succ s \supset t$. Let $v' = (v, s)$. Then
$(u, v') = j$. By the modular axiom, $[u, v'] = (s, [v, u]) = (s, t) = s$. Then since
$u \succ [u, v']$, $(u, v') \succ v'$ by Lemma 2. Thus there exists an element v' such that
$j \succ v' \supset v$, $[u, v'] = s$. Similarly, there exists an element w' such that
$j \succ w' \supset w$, $[u, w'] = s$. Let u' be any element such that $j \succ u' \supset u$. (u' exists
by our covering hypothesis.) Then clearly $(u', v') = (v', w') = (w', u') = j$, so
that no one of u', v', w' divides any other.

Consider next the three cross-cuts $v'' = [u', v']$, $w'' = [u', w']$ and $[v', w']$.
Then if any one divides another, all three are equal. For if, say, $v'' \supset w''$,
then since $u' \succ v''$ and $u' \succ w''$, $v'' = w''$. But then $[[u', v'], w''] = [u', v', w']
= w''$, so that $[v', w'] \supset w''$, $[v', w'] = w''$. Thus $[u', v'] = [v', w'] = [w', u']
= t'$, say, and $\{j, u', v', w', t'\}$ is the desired sublattice.

Assume finally that neither of v'', w'' divides the other. Then $(v'', w'') = u'$,
and if we let $t'' = [v'', w''] = [u', v', w']$, then

(16.4)    $u \succ s = [u, t'']$,   $v'' \succ t''$,   $w'' \succ t''$.

Let $u'' = (u, t'')$. Then from (16.4) and Lemma 2, $u' \supset u'' \succ t''$, so that
$u'' \neq u'$. By the modular axiom

    $[u'', v''] = [(u, t''), [u', v']] = ([u, [u', v']], t'') = (s, t'') = t''$;
    $(u'', v'') = ((u, t''), [u', v']) = (u, [u', v']) = [u', (u, v')] = [u', j] = u'$.

Similarly $[u'', w''] = t''$, $(u'', w'') = u'$. Hence we have shown that there exists
a u'' such that $u'' \succ t''$ and

    $(u'', v'') = (v'', w'') = (w'', u'') = u'$;   $[u'', v''] = [v'', w''] = [w'', u''] = t''$.

Consequently $\{u', u'', v'', w'', t''\}$ is the desired sublattice.
REFERENCES 


E. T. BELL. 

1. Algebraic Arithmetic, New York, 1927. 

2. Bull. Am. Math. Soc., vol. 16(1912), pp. 166-167. 

3. Bull. Am. Math. Soc., vol. 28(1922), pp. 111-122. 

4. Journal Indian Math. Soc., vol. 17(1928), pp. 1-12. 

5. Trans. Am. Math. Soc., vol. 25(1924), pp. 145-154. 

6. Annali di Matematica, (4), vol. 4(1927), pp. 1-6. 
GARRETT BIRKHOFF. 

1. Proc. Camb. Phil. Soc., vol. 29(1933), pp. 441-464. 

2. Proc. Camb. Phil. Soc., vol. 30(1934), pp. 115-122. 
R. P. DILWORTH.

1. Bull. Am. Math. Soc., vol. 44(1938), pp. 262-268. 

2. Bull. Am. Math. Soc., vol. 44(1938), p. 332, abstract. 

3. Bull. Am. Math. Soc., vol. 44(1938), p. 625, abstract. 













V. GLIVENKO. 
1. Am. Jour. Math., vol. 58(1936), pp. 799-828. 
2. Am. Jour. Math., vol. 59(1937), pp. 941-956. 
J. VON NEUMANN. 
1. Proc. Nat. Acad. Sci., vol. 22(1936), pp. 92-100. 
O. ORE.
1. Annals of Math., vol. 36(1935), pp. 406-437. 
M. WARD. 
1. This Journal, vol. 3(1937), pp. 627-636. 
2. Annals of Math., vol. 39(1938), pp. 558-568. 
M. WARD AND R. P. DILWORTH.
1. Proc. Nat. Acad. Sci., vol. 24(1938), pp. 162-164.
2. Trans. Am. Math. Soc., vol. 45(1939), pp. 335-354.
3. Annals of Math., vol. 40(1939), pp. 328-338. 
L. WEISNER. 
1. Trans. Am. Math. Soc., vol. 38(1935), pp. 474-484. 
2. Trans. Am. Math. Soc., vol. 38(1935), pp. 485-492. 


CALIFORNIA INSTITUTE OF TECHNOLOGY. 













MODULAR FIELDS. I 
SEPARATING TRANSCENDENCE BASES 


By SAUNDERS MAC LANE


1. Introduction. Any extension K of a given field L has a transcendence
basis T over L, that is, a set of elements $T = \{t_1, t_2, \dots\}$ algebraically inde-
pendent over L and such that all elements of K are algebraic over L(T). In other
words, K can be considered as a (possibly infinite) field of algebraic functions
of the variables $t_1, t_2, \dots$. Many properties of algebraic equations must be
restricted to separable equations, without multiple roots, so we enquire: When
does a field K have over a subfield L a "separating" transcendence basis T such
that all elements of K are roots of separable algebraic equations over L(T)?

Forms of this question arise in the analysis of intersection multiplicities for
general algebraic manifolds (B. L. van der Waerden [13]¹), in one method of
discussing the structure of complete fields with valuations (Hasse and Schmidt
[4], p. 16 and p. 46), and in the study of pure forms over function fields (Albert
[2]). The properties of such separating transcendence bases may be also con-
sidered as one part of a systematic study of the algebraic structure of fields of
characteristic p.

A first result, obtained independently by van der Waerden ([13], Lemma 1,
p. 620) and by Albert and the author ([2], Theorem 3), is

THEOREM 1. Any field K obtained by adjoining a finite number of elements to a
perfect field P has a separating transcendence basis over P.

A proof is given in §3 below. The fields treated in this theorem might also
be described as finite algebraic function fields of n variables over P, for any
integer n. A similar result for a more general ground field is (proof in §7,
Theorem 14, Corollary)

THEOREM 2. If L is a function field of one variable over a perfect coefficient
field P, and if K is obtained by adjoining to L a finite number of elements in such
a way that every element of K algebraic over L is in L, then K has a separating
transcendence basis over L.

The hypothesis that K is generated over L by a finite number of elements is 
essential to this theorem. For more general fields K there is a relation between 
the structure of K and that of its subfields over P. 


Received January 13, 1939; presented to the American Mathematical Society, December 


29, 1938. 
1 Numbers in brackets refer to the bibliography at the end of the paper. 


372 










THEOREM 3. If a field K has a finite separating transcendence basis over a
perfect subfield P, then any intermediate field M, such that $K \supset M \supset P$, also has
a separating transcendence basis over P.


The proof is given in §3. 

Our chief purpose is to obtain further theorems of this type for an extension 
K/L in which the base field L is not restricted to be a perfect field P. Thus we 
obtain in Theorem 13 of §7 necessary and sufficient conditions that a given 
subset 7’ be a separating transcendence basis for a given extension K/L, and 
also necessary and sufficient conditions that a given extension K/L have some 
finite separating transcendence basis (Theorem 14 in §7). 

Extensions K/L with separating transcendence bases are special instances
of extensions K/L which "preserve p-independence". The notion of the p-
independence of subsets of L, due to Teichmüller ([10], §3), is formulated in §4.
Those extensions K/L which "preserve" this p-independence, in the sense that
p-independent subsets of L remain p-independent in K, can be characterized
in a large number of different ways (Theorem 7 in §4, Theorem 10 of §5, Theorem
16 of §7). They arise naturally here and also in the study of the relative struc-
ture of discrete complete fields with valuations (Mac Lane [6]). Such extensions
K/L which preserve p-independence seem to have most of the properties apper-
taining to arbitrary extensions of perfect base fields. An instance is the highly
useful possibility of reducing certain equations involving only p-th powers of
elements of K (see Theorem 10 of §5).

Extensions preserving p-independence provide a natural tool for investigating 
separating transcendence bases, largely because, given an extension K/L which 
preserves p-independence, any intermediate extension M/L with M C K must 
automatically preserve p-independence. This analysis yields in §8 a theorem 
on separating transcendence bases for intermediate fields, which requires no 
mention of p-independence in its formulation, and which states that Theorem 3 
above is valid without the hypothesis that the base field P be perfect. 

In terms of p-independence we introduce for any extension K/L certain 
“relative p-bases’’ closely related to separating transcendence bases. In §9 
we show that the notion of a p-basis can be used to generalize a theorem of 
A. A. Albert on pure forms, and we give another property of p-bases. 

The earliest result on separating transcendence bases is one of F. K. 
Schmidt’s [8] who proved that a function field of one variable over a perfect field 
P has a separating transcendence basis over P. His proof (§3 below) gives 
also the slightly more general statement: 


THEOREM 4. If a field K has transcendence degree 1 over its maximal perfect
subfield P, then K has a separating transcendence basis over P.


It would be tempting to conjecture that this theorem remains true without 
the restriction that the transcendence degree of K/P is 1. That this conjecture 
is false we show by a somewhat involved example given in §10. The difficulty 
of the example resides chiefly in the problem of explicitly computing the maximal 













perfect subfield of a field specifically given. The method used in this example
(and also two corollaries to Theorem 19 of §8) throws some light on this general
problem. Our example also provides an instance of a field whose "degree of
imperfection", in the sense of Teichmüller, differs from its transcendence degree
over its maximal perfect subfield.


2. Preliminaries. Notation. All fields to be considered will have char-
acteristic a fixed prime p. If K is such a field, $K^{p^n}$ denotes the field of all $p^n$-th
powers of elements from K. Similarly, if T is any set of elements, $T^p$ is the
set of all p-th powers of elements of T, etc. If a field K and certain sets S, T, ...
are all contained in a larger field, then K(S, T, ...) denotes the field obtained
by adjoining to K all elements of S, of T, etc. If K contains a subfield L,
properties of K relative to the ground field L will be called properties of the
"extension" K/L. $K \cap M$ denotes the intersection of the fields K and M,
$S \cup T$ the union of the sets S and T, $\{t\}$ the set whose only element is t.

DEFINITIONS. Let K/L be a given extension. An element b of K is separable
over L if b satisfies a polynomial equation $f(x) = 0$, with coefficients in L, having
no multiple roots. If b in K has some power $b^{p^e}$ in L, then b is purely inseparable
over L. The extension K/L is separable (or, purely inseparable) if every b in K
is separable (or, purely inseparable) over L. A set $T \subset K$ is algebraically
independent over L if no element t of T is algebraic over the field $L(T - \{t\})$,
where $T - \{t\}$ denotes the set T with t deleted. A transcendence basis T for
K/L is a subset of K algebraically independent over L such that K is algebraic
over L(T). A separating transcendence basis for K/L is a transcendence basis
T such that K is separable over L(T). It can be shown (Mac Lane [5], Theorem
2.1) that K/L has a separating transcendence basis if and only if the elements
of K can be so well ordered that each element is either transcendental or sepa-
rable over the field obtained by adjoining to L all prior elements in the order.
The transcendence degree for K/L is the (cardinal) number of elements in any
transcendence basis for K/L.

A polynomial $f(x_1, \ldots, x_n)$ with coefficients in $K$ may involve some variable
$x_i$ only as powers of $x_i^p$; the polynomial $f$ is then called inseparable in $x_i$.
There is a largest integer $e$ such that $f$ can be written as a polynomial
$g(x_1^{p^e}, x_2, \ldots, x_n)$ involving $x_1$ only as a $p^e$-th power; this largest
$p^e$ is called the exponent of $x_1$ in $f$. (If $x_1$ fails to appear in $f$, use
$p^e = \infty$.) If $b$ is inseparable over a field $L$, the irreducible equation
$f(x) = 0$ for $b$ over $L$ is known to be inseparable (exponent $p^e > 1$) in $x$.
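
The definition of the exponent can be checked on a small case. The polynomial below is a hypothetical illustration, not one taken from the paper; it is meant only as a sketch of how the largest $e$ is read off.

```latex
% Hypothetical example over a field K of characteristic p (not from the paper).
f(x_1, x_2) = x_1^{p^2} x_2 + x_1^{p} + x_2^{p+1}
            = g(x_1^{p}, x_2), \qquad g(w, x_2) = w^{p} x_2 + w + x_2^{p+1}.
% x_1 occurs only through x_1^p, but the term x_1^p is not a power of x_1^{p^2};
% hence the largest admissible e is 1: the exponent of x_1 in f is p,
% and f is inseparable in x_1.
% x_2 occurs to the first power in the term x_1^{p^2} x_2;
% hence e = 0: the exponent of x_2 in f is p^0 = 1.
```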

A field $P$ is perfect if $P^p = P$, that is, if each element of $P$ has a unique
$p$-th root in $P$. The maximal perfect subfield of a given (imperfect) field $K$ is
the intersection $K^{p^\infty}$ of all the fields $K^{p^n}$ ($n = 1, 2, 3, \ldots$).
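
For orientation, the simplest imperfect field already shows how this intersection behaves. The choice $K = \mathbb{F}_p(t)$ below is an assumed example, not one used in the paper.

```latex
% Assumed example: K = F_p(t), with t transcendental over the prime field F_p.
K^{p^n} = \mathbb{F}_p\bigl(t^{p^n}\bigr) \quad (n = 1, 2, 3, \ldots),
\qquad
K \supset K^{p} \supset K^{p^2} \supset \cdots,
\qquad
K^{p^\infty} = \bigcap_{n \ge 1} \mathbb{F}_p\bigl(t^{p^n}\bigr) = \mathbb{F}_p .
% Each inclusion is proper, since t^{p^n} has no p-th root in F_p(t^{p^n}).
% The maximal perfect subfield is F_p itself, and K has transcendence degree 1 over it.
```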

A known result is (Teichmüller [10], Theorem 12; Mac Lane [5], §2)


LEMMA 1. An element b which is both separable and purely inseparable over a
field L lies in that field.


2 Cf. Albert [1], Chapter 7, or B. L. van der Waerden [12], Chapter 5. 
















3. Bases over perfect ground fields. Schmidt’s Theorem for function fields 
of one variable over a perfect base field may be stated in a form including 
Theorem 4 of the introduction thus: 

THEOREM 5. If an imperfect field K has transcendence degree 1 over a perfect
subfield P, then K has a separating transcendence basis over P.

Proof. Let $t$ be a transcendence basis for $K/P$. If every root $t^{p^{-e}}$ were in
$K$, $K$ would be algebraic over the perfect field $P(t, t^{p^{-1}}, t^{p^{-2}}, \ldots)$,
so $K$ itself would be perfect,³ counter to assumption. Let $e$ be the largest
integer for which $u = t^{p^{-e}}$ is in $K$. Then $u^{1/p}$ is not in $K$, and $u$ is a
transcendence basis for $K/P$. It is also a separating transcendence basis for
$K/P$ unless $K$ contains an $x$ inseparable over $P(u)$. For such an $x$ the irreducible
equation $g(x^p, u) = 0$ over $P$ involves only $p$-th powers of $x$, hence can be
written as the $p$-th power of a polynomial over $P(u^{1/p})$. Thus the adjunction
of $u^{1/p}$ to $P(u)$ reduces the degree of $x$ over $P(u)$ by a factor $p$. This means
that $u^{1/p}$ must be in $P(u, x) \subset K$, contrary to the choice of $e$.

A typical necessary and sufficient condition for the existence of a separating 
transcendence basis is 


THEOREM 6. A field K with a finite transcendence basis T over a perfect
subfield P has a separating transcendence basis over P if and only if there is an
integer e for which $K^{p^e}$ is separable over P(T).

Proof. If $K$ does have a separating transcendence basis $X$ over $P$, each
element $x$ of this basis has some power $x^{p^{e'}}$ separable over $P(T)$. $X$, like $T$,
has but a finite number⁴ of elements, so we can choose $e$ as the largest such
exponent $e'$ and have $X^{p^e}$ separable over $P(T)$. As $K$ is separable over $P(X)$,
$K^{p^e}$ is separable over $P(X^{p^e})$ and hence over $P(T)$, by the transitivity of
separability.⁵ The necessity of our condition is thereby established.

Conversely, let $e$ be the smallest integer such that $K^{p^e}$ is separable over
$P(T)$. If $e = 0$, $T$ itself is a separating transcendence basis, and we are done.
Assume then that $e > 0$. We propose to replace $T = \{t_1, \ldots, t_n\}$ by a separably
“equivalent” basis $\{u_1^p, u_2, \ldots, u_n\}$ in which one element is a $p$-th power
$u_1^p$; for then the basis $u_1, \ldots, u_n$ with $u_1$ replacing $u_1^p$ will “separate”
more of $K$.

Since $e$ was chosen as small as possible, there is in $K$ an element $y$ such that
$y^{p^e}$ but not $y^{p^{e-1}}$ is separable over $P(T)$. The irreducible equation
$f(y, T) = 0$ for $y$ over $P(T)$ then involves $y$ only in the form $y^{p^e}$; i.e., has
exponent $p^e$ in $y$. We can also assume that $f(y, T)$ is irreducible in the ring of
all polynomials in $y$ and $T$ with coefficients in $P$. All these coefficients are
$p$-th powers in the perfect field $P$. If every variable $t$ of $T$ had exponent $p$ or
greater in $f$, $f$ would contain only $p$-th powers, so could be written as a $p$-th
power of another polynomial, in contradiction to its assumed irreducibility. If
$t$ is one of the variables


3 Steinitz [9], §13, no. 4.

4 The number of elements in a transcendence basis of K/P is an invariant of K/P. See
Steinitz [9], §23.

5 Steinitz [9], §13, no. 9.










of $T$ which actually appears in $f$ with exponent 1, we can consider $f(y, T) =
f(y, t, T - \{t\}) = 0$ as an equation for $t$ over $P(T - \{t\}, y^{p^e})$. According to
the Gauss Lemma, this equation is irreducible, while it has exponent 1 in $t$,
hence is separable. Therefore $t$ is separable⁶ over $P(T - \{t\}, y^{p^e})$.

The set $T_1 = (T - \{t\}) \cup \{y\}$ obtained from $T$ by replacing $t$ by $y$ is thus
another transcendence basis for $K/P$, such that every element separable over
$P(T)$ is separable over $P(T_1)$, by the transitivity of separability (see footnote 5).
Furthermore $y$ is not separable over $P(T)$ but is separable over $P(T_1)$. The
subfield $K_{T_1}$ of all elements $b$ of $K$ separable over $P(T_1)$ is thus larger than the
subfield $K_T$ of all $b$ separable over $P(T)$. Because $T$ is finite, the degree of $K$
over this original subfield $K_T$ is finite,⁷ so a repetition of this transition from
$T$ to $T_1$ will finally yield a basis $T_m$ over which all of $K$ is separable.
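
A one-step instance of this exchange may help fix ideas. The field $K = P(y)$ and the starting basis $T = \{y^p\}$ below are assumptions chosen for illustration; they do not appear in the paper.

```latex
% Assumed example: P perfect, y transcendental over P, K = P(y), T = {t} with t = y^p.
% K is algebraic over P(t), and K^p = P(y^p) = P(t), so the smallest e is e = 1.
f(y, t) = y^{p} - t = 0
% Exponent of y in f is p  (y is inseparable over P(t));
% exponent of t in f is 1  (f is linear in t, hence irreducible in P[y, t]).
% Read as an equation for t over P(y^p), f = 0 is separable, so the exchange gives
T_1 = (T - \{t\}) \cup \{y\} = \{y\}, \qquad K = P(y) = P(T_1).
% All of K is separable over P(T_1): one exchange step already suffices here.
```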

Theorem 1 of the introduction is an immediate corollary of this theorem.
Theorem 3 concerning intermediate fields is also a corollary. For let
$K \supset L \supset P$, where $K$ has a finite separating transcendence basis over $P$.
Pick a transcendence basis $X$ for $L/P$ and a similar basis $Y$ for $K/L$. Then the
union $T = X \cup Y$ is a transcendence basis for $K/P$. The necessary condition of
Theorem 6 applied to this basis $T$ then shows that the sufficient condition of
Theorem 6 must be satisfied by the original basis $X$. Hence the intermediate
field $L$ does in fact have a separating transcendence basis over $P$.


4. Extensions preserving p-independence. With any given extension $K/L$
there is related in an invariant fashion a purely inseparable extension $K/L(K^p)$.
The latter may be analyzed by the concepts of p-independence and p-basis
introduced by Teichmüller in §3 of [10]. A subset $X$ of $K$ is relatively
p-independent in $K/L$ if $K^p(L, X')$ is a proper subfield of $K^p(L, X)$ whenever $X'$
is a proper subset of $X$. A relative p-basis $B$ for $K/L$ is a relatively
p-independent set such that $K = K^p(L, B)$. This notion of p-independence has the
usual properties of an abstract dependence relation.⁸ Therefore, every extension
$K/L$ has a relative p-basis and any two relative p-bases for the same extension
have the same (cardinal) number of elements. Furthermore any set relatively
p-independent in $K/L$ can be embedded in a p-basis for $K/L$.
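
The relative definitions can be tested on a simple transcendental extension. The field $K = L(t)$ below is an assumed example, offered only as a sketch of how the two conditions are verified.

```latex
% Assumed example: L any field of characteristic p, t transcendental over L, K = L(t).
K^{p} = L^{p}(t^{p}), \qquad K^{p}(L) = L(t^{p}) \subsetneq L(t) = K, \qquad
K^{p}(L, t) = L(t^{p}, t) = K.
% With X = {t} and X' the void set, K^p(L, X') is a proper subfield of K^p(L, X),
% so {t} is relatively p-independent in K/L; since K = K^p(L, t),
% the single element t is a relative p-basis for K/L.
```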

A subset $X$ of $K$ is (absolutely) p-independent if $X' < X$ implies $K^p(X')
< K^p(X)$, where $<$ denotes proper inclusion. An (absolute) p-basis $B$ for $K$
is a p-independent subset for which $K = K^p(B)$. This is the special case of
the above “relative” definitions obtained by assuming $L$ perfect, for then
$L = L^p$, $L \subset K^p$, and the field $K^p(L, X)$ used above becomes⁹ $K^p(X)$.
Extensions which preserve the (absolute) p-independence of subsets of $L$ will now
be characterized in several different ways.


6 The argument underlying this exchange (from $y^{p^e}$ separable over $P(T - \{t\}, t)$
to $t$ separable over $P(T - \{t\}, y^{p^e})$) can be stated generally; cf. Lemmas I and II
in Mac Lane [5].

7 By Steinitz [9], Theorem 3 in §13 one proves $K_T \subset K \subset K_T(T^{p^{-e}})$, where
$K_T(T^{p^{-e}})$ is certainly an extension of finite degree over $K_T$.

8 Van der Waerden [12], p. 204, or Mac Lane [7], §6. The results stated above are given
by Theorem 12 of [7] applied to the extension $K/K^p(L)$.

9 Teichmüller’s paper [10] considered chiefly this special case.



THEOREM 7. Any two of the following properties of an extension K/L are
equivalent:

(i) Every set $X \subset L$ p-independent in L is p-independent in K.

(ii) There is a p-basis B of L which is p-independent in K.

(iii) $L^p(S) = L \cap K^p(S)$ for every finite subset $S \subset L$.

(iv) $L'(L^p) = L \cap L'(K^p)$ for every subfield $L' \subset L$.


Condition (iv) on the intersection $L \cap L'(K^p)$ states in effect that the
adjunction to $L'$ of $p$-th powers of elements of $K$ yields no elements in $L$ not
obtainable by adjoining $p$-th powers from $L$.


DEFINITION. Any extension K/L with one of the equivalent properties (i),
(ii), (iii), or (iv) will be said to preserve p-independence.


Proof. That (i) implies (ii) is trivial, so consider (ii) → (i). Were some
p-independent subset $X$ of $L$ not p-independent in $K$, there would be an element
$x$ in $X$ contained in $K^p(X')$, where $X'$ is a finite subset of $X$ not containing $x$.
Since $B$ of (ii) is a p-basis of $L$, there is a finite subset $B_1 \subset B$ such that $x$
and $X'$ are in $L^p(B_1)$. This means that $X' \cup \{x\}$ is p-dependent on $B_1$. In
such circumstances one can exchange the elements $x$ and $x' \in X'$ successively
with suitable elements¹⁰ of $B_1$, until all of $B_1$ is p-dependent on $X'$, $x$ and a
remaining subset $B_2 \subset B_1$. This means that


(1) $L^p(B_1) = L^p(X', x, B_2)$


and that the combined set $X' \cup \{x\} \cup B_2$ is p-independent in $L$. The two
finite sets $B_1$ and $X' \cup \{x\} \cup B_2$ are mutually p-dependent over $L$ and have
the same number of elements, by construction. But the subset $B_1$ of $B$ is
assumed to remain p-independent in $K$, while the other subset $X' \cup \{x\} \cup B_2$
is p-dependent over $K$ because we supposed $x$ to be in $K^p(X')$. This is a
contradiction since this makes $K^p(B_1)$ have a degree $p^{\beta_1}$ over $K^p$, if $\beta_1$ is
the number of elements in $B_1$, while the equal field $K^p(X', x, B_2) = K^p(X', B_2)$
has a smaller degree over $K^p$. Therefore (ii) → (i) in our theorem.

To demonstrate (iii) → (i), suppose counter to (i) that some p-independent
set $X$ of $L$ becomes p-dependent in $K$, so that again some $x \in K^p(X')$.
Therefore $x$ is in the intersection $L \cap K^p(X') = L^p(X')$, by (iii). This result
states that $x$ is p-dependent on $X'$ over $L$, counter to assumption.

The implication (iv) → (iii) may be obtained trivially by setting $L' = P(S)$,
where $P$ is used to denote some perfect subfield of $L$.

Finally, to prove that (i) → (iv), let $L'$ be any subfield of $L$ and pick a relative
p-basis $T$ for $L'(L^p)/L^p$. Then $L'(L^p) = L^p(T)$. If the conclusion $L \cap L'(K^p)$


10 By the “exchange” property for dependence relations: If x depends on C and d,
but not on C alone, then d depends on C and x. Cf. Teichmüller [10], §3 or Mac Lane [7],
§6 and Axiom (£;). 







$= L'(L^p)$ of (iv) were false, there would be an element $y$ of $L$ in $L'(K^p) =
K^p(L') = K^p(T)$ but not in $L'(L^p) = L^p(T)$. Thus $y$ is not p-dependent on the
set $T$ in $L$, while $T$ is by construction p-independent in $L$, so that the usual
properties of dependence make¹¹ $T \cup \{y\}$ a p-independent subset of $L$.
Condition (i) then insures that $T \cup \{y\}$ is p-independent in $K$, in conflict with
the previous assertion that $y$ is in $K^p(T)$. Hence (i) → (iv). The various
implications (ii) ↔ (i) → (iv) → (iii) → (i) completely establish the theorem.

Such an independence-preserving extension might be viewed as a general- 
ization of the ordinary separable algebraic extensions, in the following sense: 


THEOREM 8. If K is an algebraic extension of L, then K/L preserves
p-independence if and only if K/L is separable.

Proof. If $K/L$ is separable, each p-basis of $L$ remains a p-basis of $K$ (see
footnote 14), and this insures that p-independence is preserved. Conversely,
suppose that $K/L$ preserves p-independence but is not separable, and denote by
$K_s$ the subfield consisting of those elements of $K$ which are separable over $L$.
Then $K \supset K_s$ and $K/K_s$ is purely inseparable,¹² so $K$ contains a $c$ not in $K_s$
with $c^p$ in $K_s$. Any p-basis $B$ of $L$ is also a p-basis of $K_s$, so $c^p$ is p-dependent
on $B$, thus lies in $K_s^p(B)$. The exchange property (see footnote 10) of
p-dependence provides for an exchange of $c^p$ with some $b \in B$, with the result that
$b \in K_s^p(B - \{b\}, c^p) \subset K^p(B - \{b\})$. This states that the set $B$ is not
p-independent in $K$, in violation of the assumption that the extension $K/L$
preserves the p-independence of $B$. Hence the theorem is proved.

Other examples of extensions which preserve p-independence will now be cited 
(cf. Mac Lane [6], §6), but for our purposes it is especially important to note 
that any extension of a perfect field always preserves p-independence. 


THEOREM 9. (a) An extension K/L preserves p-independence if L is perfect
or if K/L has a separating transcendence basis.

(b) If L is an extension of transcendence degree 1 over a perfect field P, then
an extension K/L preserves p-independence if and only if no element of K is
inseparable and algebraic over L. In particular, K/L preserves p-independence if
L is relatively algebraically closed¹³ in K.

Proof. We first prove (a). If $L$ is perfect, each p-basis of $L$ is void, hence
necessarily remains p-independent in any $K$. If $K/L$ has a separating
transcendence basis and if $B$ is a p-basis of $L$, then $B$ is part of a p-basis¹⁴ of $K$,
hence does remain p-independent in the extension $K/L$.

Part (b) refers especially to fields L which are function fields of one variable 


11 Mac Lane [7], corollary to Theorem 2. 

12 Steinitz [9], §14, Theorem 1. 

13 A subfield L of K is relatively algebraically closed in K if every element of K algebraic 
over L lies in L. 

14 By the following theorems of Teichmüller [10], §3: If K is a separable algebraic
extension of L, any p-basis of L is a p-basis of K. If L(T) is a purely transcendental
extension of L by algebraically independent elements T, then $B \cup T$ is a p-basis of
L(T) if B is a p-basis of L. Both can be proved readily from the appropriate definitions.

























over a perfect coefficient field $P$. Suppose first that $K$ contains no element
inseparable over $L$. If $L$ were perfect, $K/L$ would preserve p-independence
by part (a), so we can assume that $L$ is imperfect. By Theorem 5, $L/P$ then
has a separating transcendence basis of one element, $t$. This element is also
(see footnote 14) a p-basis of $L$, so that if $K/L$ were not to preserve
p-independence, $t$ would be in $K^p$. Then $t^{1/p}$ is in $K$, but is inseparable over $L$,
contrary to the assumed character of $L$. Hence $K/L$ preserves p-independence.

Conversely, suppose that $K/L$ does preserve p-independence but that some
$b$ in $K$ is inseparable over $L$. We can again suppose $L$ imperfect and $t$ a
separating transcendence basis for $L/P$. The irreducible equation $f(y, t) = 0$ for
$y = b$ over $P[t]$ then has an exponent $p^e > 1$ in $y$, but because of its
irreducibility has exponent 1 in $t$. Viewed as an equation¹⁵ for $t$ over $P[y]$, it
shows that $t$ is separable over $P(b^{p^e}) \subset K^p$. But $t$ is also purely inseparable
over $K^p$, hence $t$ is in $K^p$, in conflict with the hypothesis that $K/L$ preserves the
p-independence of the p-basis $\{t\}$. This completes the proof of part (b) of the
theorem.


5. Equations involving p-th powers. The essential intrinsic property of 
extensions K/L which preserve p-independence is the possibility of reducing 
those algebraic equations between elements of K which involve only coefficients 
from L and p-th powers of elements from K. This includes a known, simple 
property of perfect fields L. 


THEOREM 10. A necessary and sufficient condition that an extension K/L
preserve p-independence is that, for every finite subset Y of K, the linear dependence
over L of the set $Y^p$ implies the linear dependence over L of the set Y itself.
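
The criterion of Theorem 10 is easy to see fail in the simplest inseparable case. The element $c$ and the field $K = L(a)$ below are assumed for illustration and are not taken from the paper.

```latex
% Assumed example: L imperfect, c in L but not in L^p, K = L(a) with a^p = c.
Y = \{1, a\}, \qquad Y^{p} = \{1, c\}.
% Y^p is linearly dependent over L, by the non-trivial relation
c \cdot 1 + (-1) \cdot c = 0 \qquad (\text{coefficients } c,\ -1 \in L),
% whereas 1 and a are linearly independent over L, because a is not in L.
% The condition of Theorem 10 therefore fails, so K/L does not preserve
% p-independence; this agrees with Theorem 8, since K/L is algebraic and inseparable.
```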


Proof. Suppose first that $K/L$ preserves p-independence, and that the
elements $y_1, \ldots, y_m$ of some set $Y$ have their $p$-th powers linearly dependent
over $L$. Select a p-basis $B$ of $L$, so that $L = L^p(B)$, and choose a finite subset
$U \subset B$ with the property that $y_1^p, \ldots, y_m^p$ are linearly dependent over
$L^p(U)$. If this is true for $U$ the null set, $y_1, \ldots, y_m$ are certainly linearly
dependent over $L$. Otherwise we can successively delete elements from $U$ till we
find a new $U'$ and an element $u$ such that $y_1^p, \ldots, y_m^p$ are linearly dependent
over $L^p(U', u)$ but not over $L^p(U')$. Since $u$ has degree $p$ over $L^p(U')$, the given
linear dependence relation may be expressed as

$$\sum_{i=1}^{m} \Bigl( \sum_{j=0}^{p-1} b_{ij} u^j \Bigr) y_i^p = 0, \qquad b_{ij} \in L^p(U'),$$

where not all $b_{ij} = 0$. Therefore

(1) $$\sum_{j=0}^{p-1} \Bigl( \sum_{i=1}^{m} b_{ij} y_i^p \Bigr) u^j = 0.$$

If one of the coefficients $\sum_i b_{ij} y_i^p$ in this equation is not 0, (1) is a separable
equation for $u$ over $K^p(U')$, so that $u$ is p-dependent on $U'$ in $K$, in violation


15 Compare the “exchange” argument for Theorem 6.







of the hypothesis that $K/L$ preserves p-independence. If the coefficients of all
powers $u^j$ in (1) are zero, any one of these coefficients which involves a
$b_{ij} \neq 0$ provides a linear dependence between $y_1^p, \ldots, y_m^p$ over $L^p(U')$, in
contradiction to the choice of $u$. Hence $y_1, \ldots, y_m$ are linearly dependent over $L$.


Conversely, suppose that the linear dependence of a set $Y^p$ always implies
that of $Y$, but that $K/L$ does not preserve p-independence. Then any p-basis
$B$ of $L$ must become p-dependent in $K$, so that some $b$ is in $K^p(b_1, \ldots, b_n)$,
where $b, b_1, \ldots, b_n$ are distinct elements of $B$. The algebraic extension
$K^p(b_1, \ldots, b_n)$ has over $K^p$ a linear basis consisting of all power products
$c = b_1^{e_1} \cdots b_n^{e_n}$ with exponents $e_i = 0, 1, \ldots, p - 1$. If $c_1, \ldots, c_m$ are all
such power products, $b \in K^p(b_1, \ldots, b_n)$ means that there are elements $y_i$ not
all zero in $K$ for which

(2) $$b + y_1^p c_1 + \cdots + y_m^p c_m = 0.$$

Among the elements $1, y_1^p, \ldots, y_m^p$ of $K^p$ pick a linearly independent basis,
over $L^p$, consisting of $1, z_1^p, \ldots, z_r^p$, so that each $y_i^p$ may be written as

$$y_i^p = d_{i0}^p + d_{i1}^p z_1^p + \cdots + d_{ir}^p z_r^p, \qquad d_{ij} \text{ in } L.$$

If these expressions are substituted in (2) and the coefficients of each $z_j^p$
collected, one finds

(3) $$\Bigl( b + \sum_{i=1}^{m} d_{i0}^p c_i \Bigr) + \Bigl( \sum_{i=1}^{m} d_{i1}^p c_i \Bigr) z_1^p + \cdots + \Bigl( \sum_{i=1}^{m} d_{ir}^p c_i \Bigr) z_r^p = 0.$$

Here the first term $b + \sum_i d_{i0}^p c_i$ cannot be zero, because $b$ is not p-dependent
on $b_1, \ldots, b_n$ over the original field $L$. Therefore (3) asserts that
$1, z_1^p, \ldots, z_r^p$ are linearly dependent over $L$. The hypothesis of the theorem
then shows that $1, z_1, \ldots, z_r$ are linearly dependent over $L$. It therefore follows
that $1, z_1^p, \ldots, z_r^p$ are linearly dependent over $L^p$ and the construction of the
$z^p$'s as linearly independent over $L^p$ is contradicted. This implies that the p-basis
$B$ of $L$ remains p-independent, which is to say that the given extension does
preserve p-independence.

LEMMA 2. Let the extension K/L preserve p-independence, and let quantities
$t_1, \ldots, t_n$ be algebraically independent over L, while $t_0$ in K is algebraically
dependent on $t_1, \ldots, t_n$ according to a relation $f(t_0, t_1, \ldots, t_n) = 0$, with
coefficients in L. If f is irreducible over L as a polynomial in the variables
$t_0, \ldots, t_n$, then f necessarily has exponent 1 in at least one of these variables.

Proof. The conclusion states in effect that such an irreducible $f$ cannot be
a polynomial in the $p$-th powers $t_0^p, \ldots, t_n^p$. If this were the case, $f$ would
be a linear relation between the $p$-th powers of a certain set of distinct
power-products $y_i$

$$f(t_0, \ldots, t_n) = \sum_{i=1}^{m} a_i y_i^p = 0, \qquad y_i = t_0^{e_{i0}} \cdots t_n^{e_{in}} \quad (i = 1, \ldots, m),$$
















where all the coefficients $a_i \neq 0$. This linear dependence of $y_1^p, \ldots, y_m^p$ over
$L$ implies by Theorem 10 a linear dependence of $y_1, \ldots, y_m$:

$$g(t_0, \ldots, t_n) = \sum_{i=1}^{m} b_i y_i = 0, \qquad b_i \text{ in } L.$$

Not all $b_i = 0$, so some $t_i$, say $t_n$, must actually appear in $g$. The degree $d$ of $f$
in this quantity is then at least $p$ times the degree of $g$ in $t_n$. By the Gauss
Lemma $d$ is the degree of the element $t_n$ of $K$ over the field $L(t_0, \ldots, t_{n-1})$,
although $g = 0$ provides an equation of smaller degree for $t_n$ over that field.
With this contradiction to the assumption that $f$ contained only $p$-th powers
the lemma is established.


6. Relative p-bases. Further preliminaries are necessary for the subsequent
exposition of the close connection between relative p-bases of an extension
$K/L$ in the sense of §4 and separating transcendence bases for the same
extension. A first result is the algebraic independence of p-bases, which was
established by Teichmüller ([10], Theorem 15) for the case of absolute p-bases.


THEOREM 11. The elements of any relative p-basis B of an extension K/L which 
preserves p-independence are algebraically independent over L. 


Proof. Were the elements of $B$ algebraically dependent, one could find
elements $t_0, t_1, \ldots, t_n$ in $B$ algebraically dependent but with $t_1, \ldots, t_n$
algebraically independent over $L$. An irreducible polynomial relation
$f(t_0, \ldots, t_n) = 0$ between these quantities must then as in Lemma 2 contain one
variable, say $t_n$, of exponent 1. This equation provides an irreducible and
separable equation for $t_n$ over the field $L(t_0, \ldots, t_{n-1}) \subset K^p(L, t_0, \ldots, t_{n-1})$.
Over the latter field $K^p(L, t_0, \ldots, t_{n-1})$, $t_n = (t_n^p)^{1/p}$ is also purely
inseparable, so that $t_n$ must lie in this field (Lemma 1, §2). This conclusion makes
$t_n$ relatively p-dependent on $t_0, \ldots, t_{n-1}$, contrary to the hypothesis on
$B \supset \{t_0, \ldots, t_n\}$.

Explicit relative p-bases can be found from absolute p-bases in specific cases 
by the following process of composition and decomposition. 


THEOREM 12. If an extension K/L preserves p-independence and if B and C
are disjoint subsets of K with $C \subset L$, then any two of the following statements
imply the third:

(i) B is a relative p-basis of K/L;

(ii) C is a p-basis for¹⁶ L;

(iii) $B \cup C$ is a p-basis for¹⁶ K.

Proof. We prove first that (i) & (ii) → (iii). The union $B \cup C$ might be
p-dependent in two ways. In the first place, an element $b$ of $B$ might lie in
$K^p(B - \{b\}, C) \subset K^p(L, B - \{b\})$, but this would violate the relative
p-independence of the set $B$. In the second place, an element $c$ of $C$ might lie in
$K^p(B, C - \{c\})$. There then are distinct elements $b_1, \ldots, b_m$ in $B$ such that


(1) $c \in K^p(b_1, \ldots, b_m, C - \{c\})$


and such that the statement (1) would be false were any $b_i$ omitted. At least
one $b_i$ is present in (1) ($m \geq 1$) because $C$ is known to be p-independent in $L$
and therefore in $K$. Over the smaller field $K^p(b_2, \ldots, b_m, C - \{c\})$ both
$c$ and $b_1$ have the degree $p$, so (1) yields an “exchanged” statement


$b_1 \in K^p(c, b_2, \ldots, b_m, C - \{c\}) = K^p(b_2, \ldots, b_m, C)$.


This type of p-dependence has already been led to a contradiction, so $B \cup C$
is in fact p-independent in $K$. That it forms a p-basis for $K$ is then readily
shown.


16 From this theorem it is also possible to obtain a similar but more general theorem
in which statements (ii) and (iii) concern relative p-bases for L/M and K/M respectively,
where $K \supset L \supset M$ and the extension L/M is assumed to preserve p-independence
(Theorem 12 is the case when M is perfect).

The converse implication (ii) & (iii) → (i) is trivial, granted the hypothesis
$B \cap C = 0$. As for the third implication, (i) & (iii) → (ii), the p-independence
of the set $C$ in $L$ results at once from its assumed p-independence (hypothesis
(iii)) in the larger field $K$. Were $C$ not a p-basis of $L$, there would be in $L$
an $x$ not in $L^p(C)$. By (iii), $x$ is in $K^p(B, C)$, so there are distinct elements
$b_1, \ldots, b_m$ in $B$ such that


(2) $x \in K^p(b_1, \ldots, b_m, C)$,

but such that the statement (2) would be false were any $b_i$ omitted. If $m > 0$,
one deduces as in the previous argument from (1) that
$b_1 \in K^p(x, b_2, \ldots, b_m, C) \subset K^p(L, b_2, \ldots, b_m)$, a violation of the relative
p-independence of $B$ in $K/L$ (hypothesis (i)). Thus (2) is $x \in K^p(C)$, although we
had assumed $x \in L^p(C)$ false. This states that the p-independent set (see
footnote 11) $C \cup \{x\}$ of $L$ has become p-dependent over $K$, contrary to the basic
assumption on this extension $K/L$. The theorem is thereby completely proved.

We can now describe particular relative p-bases in two typical extended fields 
(involving transcendental, separable, and inseparable adjunctions). 

LEMMA 3. A separating transcendence basis S for an extension K/L is always
a relative p-basis for K/L.

For, any p-basis $B$ of $L$ gives rise to a p-basis (see footnote 14) $B \cup S$ for
$K$ and this, by the decomposition Theorem 12, makes $S$ a relative p-basis
for $K/L$.

LEMMA 4. If $K \supset K_0 \supset L$, where $K/K_0$ is a finite purely inseparable
extension and K/L preserves p-independence, then the (cardinal) number of elements
in a relative p-basis for K/L is the same as the number of elements¹⁷ in a relative
p-basis for $K_0/L$.

Proof. It suffices to consider the case when the degree $[K : K_0]$ is $p$, so that
$K = K_0(a)$, where $a^p$ is in $K_0$. We wish to construct for $K/L$ a relative p-basis


17 For the absolute case (L perfect) this lemma has been proved by M. Becker [3].
















containing this element $a$; to this end we show first that $a^p$ is relatively
p-independent in $K_0/L$. Otherwise $a^p$ would be in $K_0^p(L) = K_0^p(B)$, where $B$ is a
p-basis for $L$. Since $a^p$ is not in $K_0^p$, $a^p$ can here be exchanged with an element
$b$ of $B$, which means that $b \in K_0^p(a^p, B - \{b\}) \subset K^p(B - \{b\})$, contrary to the
p-independence of $B$ in $L$ and $K$.

Since $a^p$ is relatively p-independent in $K_0/L$, there is a relative p-basis $C_0$ for
$K_0/L$ containing $a^p$. The replacement of $a^p$ by $a$ in the set $C_0$ yields, as may be
verified from the definitions, a p-basis $C$ for $K/L$. $C$ and $C_0$ have the same
number of elements, so our lemma is proved.


7. Criteria for separating transcendence bases. The notions of
p-independence will now be applied to obtain two types of theorems: first, necessary
and sufficient conditions that a given set $T$ be a separating transcendence
basis for a given extension $K/L$; secondly, necessary and sufficient conditions
that there exist some separating transcendence basis for a given extension.


THEOREM 13. If the extension K/L preserves p-independence, then a subset
$T \subset K$ is a separating transcendence basis for K/L if and only if T is both a
transcendence basis for K/L and relatively p-independent in K/L.


Proof. Lemma 3 insures that any separating transcendence basis has the
two specified properties. Conversely, suppose that some $T$ with these two
properties is not a separating transcendence basis for $K/L$. Then some $b$ of $K$
is not separable over $L(T)$, hence satisfies for $y = b$ an irreducible polynomial
equation $f(y, T) = 0$ with exponent at least $p$ in $y$ and with coefficients in $L$.
At least one variable $t$ of $T$ must (by Lemma 2 of §5) have exponent 1 in this
polynomial $f$. As in the “exchange” argument for Theorem 6, we can regard
$f(y, T) = 0$ as an irreducible and separable equation for $t$ over $L(T - \{t\}, b^p)$,
for $f$ involves $y = b$ only as $y^p$. Therefore $t$ is separable over the larger field
$K^p(L, T - \{t\})$. But $t$ is also purely inseparable over this field, so, by Lemma 1,
$t$ must lie in the field $K^p(L, T - \{t\})$. This conclusion states that $t$ is relatively
p-dependent on $T - \{t\}$, contrary to the hypothesis that $T$ is relatively
p-independent. Hence $T$ must be a separating basis.


THEOREM 14. If the field K has a finite transcendence basis T over its subfield
L, then K has a separating transcendence basis over L if and only if

(i) K/L preserves p-independence;

(ii) for some integer e, $L(K^{p^e})$ is separable over L(T).

Proof. That condition (i) is necessary was established in Theorem 9(a),
while the necessity of (ii) results from the finiteness of the set $T$ exactly as in
Theorem 6 of §3.

Conversely, suppose that (i) and (ii) hold, and pick a relative p-basis $B$ for
$K/L$. According to (i) and Theorem 11 the elements of $B$ are algebraically
independent over $L$, so that we can¹⁸ embed $B$ in a transcendence basis $B \cup X$


18 In any (abstract) dependence relation, an independent subset can be enlarged to
form a maximal independent subset. Mac Lane [7], Theorem 3.










for $K/L$. From the assumed separability of $L(K^{p^e})$ over the original
transcendence basis $T$ one finds, since $T$ is finite, a larger integer $f$ such that
$L(K^{p^f})$ is separable over $L(B, X)$. Thence we deduce successively the
separability of each of the following extensions


(1) $L^p(K^{p^{f+1}})/L^p(B^p, X^p)$; $L(K^{p^{f+1}})/L(B^p, X^p)$; $K^{p^{f+1}}(L, B)/L(B, X^p)$.


But $B$ is a relative p-basis for $K/L$, hence $K = K^p(L, B) = K^{p^2}(L, B) = \cdots
= K^{p^{f+1}}(L, B)$. Thus (1) states that $K$ is separable over the basis $B \cup X^p$
which is obviously a transcendence basis because $B \cup X$ is by construction a
transcendence basis. Thus we have found, as required, a separating
transcendence basis¹⁹ $B \cup X^p$ for $K/L$.


COROLLARY. If L is a field of transcendence degree 1 over a perfect subfield P,
and if K is an extension of L of finite transcendence degree, containing no element
inseparable and algebraic over L, then K has a separating transcendence basis over L.


This follows at once from Theorems 14 and 9(b); it includes Theorem 2 of
the introduction as the special case when $L$ is a function field of one variable
over $P$. It is also possible to prove this corollary without using the notion of
p-independence, by a suitable extension of the exchange process used for
Theorem 6. The hypothesis that $L$ is a function field of only one variable is
essential to this theorem; for suppose instead that $L = P(x, y)$ is a rational function
field of two independent variables $x$ and $y$ over a perfect field $P$, and consider
the extension $K = L(z, u)$, where $z$ is transcendental over $L$ and $u$ is a root of the
inseparable equation $u^p = y + xz^p$. Any separating transcendence basis for
$K/L$ would consist of a single element $t$. Let $f(u, t) = 0$ and $g(z, t) = 0$ be
respectively the separable irreducible polynomial equations for $u$ and $z$ over
$L[t]$, of respective exponents $p^\alpha$ and $p^\beta$ in $t$. Then $u$ is separable over
$L(t^{p^\alpha})$, $z$ is separable over $L(t^{p^\beta})$, and $t^{p^\beta}$ is separable over $L(z)$. If
$\alpha \geq \beta$, $u$ is separable over $L(z)$, although the given equation $u^p = y + xz^p$ is
inseparable. A similar contradiction arises if $\beta \geq \alpha$. Hence this extension $K$ of
a function field $L$ of two variables can have no separating transcendence basis $t$.

The close and natural relation between extensions preserving independence
and extensions with separating transcendence bases will now be documented
with a pair of theorems, one of which gives a necessary and sufficient condition
for the existence of a separating basis, while the other gives conditions for the
preservation of p-independence.


THEOREM 15. Let K be a field obtained by adjoining a finite number of elements
to L. Then K/L preserves p-independence if and only if K has a separating
transcendence basis over L. Furthermore, if K/L does preserve p-independence,
then a subset T of K is a separating transcendence basis for K/L if and only if it
is a relative p-basis for K/L.


19 B alone is a transcendence basis, for X must be void. Any x in X is in K, hence by
(1) is separable over $L(B, X^p)$, although x manifestly satisfies the inseparable equation
$z^p - x^p = 0$ over $L(B, X^p)$, a contradiction to the assumption that X is not void.













Proof. We know that the presence of a separating transcendence basis $S$
for $K/L$ makes $K/L$ preserve p-independence (Theorem 9(a)) and makes $S$
a relative p-basis (Lemma 3 in §6). Thus we need only consider a relative
p-basis $T$ for an extension $K/L$ which does preserve p-independence, and prove
that $T$ is a separating transcendence basis. If $X$ is any transcendence basis
for $K/L$, and if $K_s$ is the subfield of all elements of $K$ separable over $L(X)$,
then $X$ is a separating transcendence basis for $K_s/L$, hence a relative p-basis
for $K_s/L$ (Lemma 3 in §6). But $K$ must be a purely inseparable extension of
$K_s$, so that $K/L$ has by Lemma 4 of §6 a relative p-basis $Y$ consisting of exactly
$m$ elements, where $m$ is the number of elements in $X$. The two p-bases $Y$ and
$T$ both have the same number of elements,²⁰ so that we have in $T$ a set of
elements algebraically independent (Theorem 11) and equal in number to the
number $m$ of elements in a transcendence basis $X$. Therefore $T$ is also a
transcendence basis for $K/L$, so must be a separating basis by the sufficient
condition of Theorem 13. Theorem 15 is established.


COROLLARY. If an extension K/L has a finite separating transcendence basis,
then any relative p-basis for K/L is a separating transcendence basis for K/L.


Example. The hypothesis that the transcendence degree of K/L is finite is
necessary for the validity of this corollary, even if we restrict the base field
L to be perfect, as may be seen by the following example. Let P be a perfect
field over which the elements $t_0, x_1, x_2, x_3, \dots$ are algebraically independent,
and introduce the quantities $t_n$ successively as the roots of the (inseparable)
equations

$t_n^p = t_{n-1} + x_n$ (n = 1, 2, 3, ...).

K is the field $P(x_1, x_2, \dots, t_0, t_1, t_2, \dots)$ generated by all these quantities,
and $T = \{t_0, t_1, t_2, \dots\}$ is a separating transcendence basis for K/P.
Furthermore, the set $X = \{x_1, x_2, \dots\}$ can be shown to be a p-basis for K, hence
a relative p-basis for K/P. Nevertheless, this p-basis is not even a
transcendence basis, so is certainly not a separating transcendence basis.
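
The two claims of the example can be checked in a few lines. This is only a sketch, and it assumes the defining equations are read as $t_n^p = t_{n-1} + x_n$ (the scan is damaged at that display); the remaining symbols are those of the example.

```latex
% K is generated by T alone, so T is a (separating) transcendence basis:
x_n = t_n^p - t_{n-1} \in P(T) \quad (n \ge 1)
\qquad\Longrightarrow\qquad K = P(T).

% X generates K over K^p, as a p-basis must:
t_{n-1} = t_n^p - x_n \in K^p(X) \quad (n \ge 1)
\qquad\Longrightarrow\qquad K = K^p(X).

% X is not a transcendence basis: t_0, x_1, x_2, \dots are algebraically
% independent over P, so t_0 is transcendental over P(X).
```

The computation shows generation only; the p-independence of X in K still needs the argument the text alludes to.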


THEOREM 16. An extension K/L preserves p-independence if and only if
$L(y_1, \dots, y_n)$ has a separating transcendence basis over L for every finite set of
elements $y_1, \dots, y_n$ from K.


Proof. If K/L does preserve p-independence, the extension from L to the
subfield $K' = L(y_1, \dots, y_n)$ must also preserve p-independence, so that
Theorem 15 at once gives a separating transcendence basis for K'/L. Conversely,
suppose that each K'/L has such a basis, but that the whole extension K/L
does not preserve p-independence, so that some p-independent subset X of L
becomes p-dependent in K. This means that some $x \in K^p(X')$, where $X' \subset X$
is a finite subset not containing x. Therefore $x \in L^p(y_1^p, \dots, y_n^p, X')$ for
suitable elements $y_1, \dots, y_n$ in K. This result states that the original set X has
already become p-dependent in the field $K' = L(y_1, \dots, y_n)$, contrary to the
hypothesis that every subfield with such a finite generation has a separating
transcendence basis and hence preserves p-independence.

$^{20}$ This number is an invariant of the extension K/L. Mac Lane [7], Theorem 6.

The distinction between extensions preserving p-independence and extensions
with separating transcendence bases arises only for extensions with infinitely
many generators, as for instance in the extension $K = L(x^{p^{-1}}, x^{p^{-2}}, \dots)$ (where
x is transcendental over L), which preserves p-independence but which,
according to Theorem 14, does not have over L a separating transcendence basis.


8. Intermediate fields. The question next to be considered is this: If an 
extension K/M has a separating transcendence basis, and if L is a field between 
K and M, under what circumstances does L have a separating transcendence 
basis over M? Our answer, though dependent on the previous analyses of 
p-independence, can be stated independently of that notion. 


THEOREM 17. If the fields $K \supset L \supset M$ are such that K has a finite separating
transcendence basis over M, then L also has a (finite) separating transcendence
basis over M.

Proof. Certainly K/M and thus L/M preserves p-independence (Theorem
9(a)). Pick transcendence bases X and Y respectively for K/L and L/M;
$X \cup Y$ is then a transcendence basis for K/M. By the necessary condition
(Theorem 14) for the existence of a separating basis, $M(K^{p^e})$ is separable over
M(X, Y) for some e. Therefore $M(L^{p^e})$ is separable over M(X, Y). The
adjunction of the indeterminates X cannot reduce any equations irreducible
over M(Y), so $M(L^{p^e})$ is also separable over M(Y). This is the sufficient
condition of Theorem 14 for the existence of a separating transcendence basis
for L/M.

Example. This conclusion could not be asserted were the transcendence 
degree of K/M infinite, even if we restrict M to be a perfect field P. For let 


T = {t,t,--- } be a set of elements algebraically independent over P, and 
define another set Y = |y2, ys, --- } successively by the inseparable equations 
(1) yn = tne + tnalh (n = 2, 3, 4, --:). 


Then the field L = P(T, Y) does not have a separating transcendence basis$^{21}$
over P. Nevertheless L can be embedded in a larger field $K = L(T^{p^{-1}}) =
P(T, Y, T^{p^{-1}}) = P(T^{p^{-1}})$ which does have the separating transcendence basis
$T^{p^{-1}}$ over P. This counter-example depends essentially on the fact that the
intermediate field L also has an infinite transcendence degree.

COROLLARY. If the fields $K \supset L \supset M$ are such that K has a separating
transcendence basis S over M and L has a finite transcendence degree over M, then L
has a separating transcendence basis over M.

$^{21}$ Proof given in Mac Lane [5], Lemma 8.5, where the present L appears as a field $S_1$,
and where it is shown that L has a separating transcendence basis neither over P nor
over P(t).



















Proof. Since L/M has a finite transcendence basis, all elements of this basis
are algebraic over a finite subset $S_0$ of S. Hence L is contained in the field $K_0$
composed of those elements of K algebraic over $M(S_0)$. For this field, $S_0$ is a
finite separating transcendence basis, so Theorem 17 applies to $K_0 \supset L \supset M$.

The question of a separating basis for K itself over the intermediate field L 
may now be considered. 


THEOREM 18. If $K \supset L \supset M$, where K has a finite separating transcendence
basis over M, then K has a separating transcendence basis over L if and only if K/L
preserves p-independence.


Proof. The necessity of this condition is known (Theorem 9(a)).
Conversely, suppose that K/L does preserve p-independence, that T is any
transcendence basis for K/L, and that S is the given separating basis for K/M.
Because S is finite, there is an integer e such that $S^{p^e}$ is separable over L(T).
Then $L(K^{p^e})$ is separable over L(T), so that K/L must have a separating
transcendence basis by the fundamental criterion of Theorem 14.

It is impossible that a field K have a separating transcendence basis over 
any subfield smaller than its maximal perfect subfield, as one sees by the follow- 
ing result. 


THEOREM 19. If a field K has a separating transcendence basis over a subfield L,
then L contains the maximal perfect subfield of K.


Proof. Let $K^{p^\infty}$ denote the maximal perfect subfield of K. If the conclusion
were false, there would be a b in $K^{p^\infty}$ but not in L. All roots $b^{p^{-e}}$ lie in the
perfect field $K^{p^\infty} \subset K$, so they all are separable over the extension L(T)
obtained from the given separating transcendence basis T for K/L. But y = b
is the root of some equation g(y, T) = 0 irreducible in the polynomial ring
L[y, T]. Let t be a variable of T whose exponent $p^s$ in g is as small as possible,
so that no other $t_i$ of T has exponent less than $p^s$, and let $T_0 = T - \{t\}$ be the
set T with t deleted. Then $g(y, T) = g(y, t, T_0)$ is an irreducible separable
equation for y = b over $L(t^{p^s}, T_0^{p^s})$, with exponent 1 in the variable $t^{p^s}$. It
follows (lemma below) that b is not separable over the smaller field $L(t^{p^{s+1}}, T_0^{p^{s+1}})$.

There is therefore an integer m and a corresponding set $S = T^{p^m}$ of
transcendents such that all the roots $b^{p^{-e}}$ are separable over L(S), while not all the roots
$b^{p^{-e}}$ are separable over $L(S^p)$. Let $c = b^{p^{-e}}$ be one such inseparable root, so
that c itself is not separable over $L(S^p)$, although $c^{p^{-1}}$ is separable over L(S).
If we apply the isomorphism $a \to a^p$, it follows that c is separable over $L^p(S^p)$,
hence over the larger field $L(S^p)$, in contradiction to the choice of c. This
contradiction establishes the theorem.

The lemma as to the inseparability of b used above is 


LEMMA 5. If a subset T of a field K is algebraically independent over a
subfield L, and if an element b of K is a separable root y = b of a polynomial g(y, T)
irreducible in the polynomial ring L[y, T], then a variable t of T can appear in g
with exponent 1 if and only if b is inseparable over $L(t^p, T - \{t\})$.


















The simple proof given in Mac Lane [5], Lemma II, §2, for the case when
L is perfect, applies equally well for any field L. Easy consequences of the
theorem are the following:

COROLLARY 1. If a field K is obtained from a field L by successive transcendental
and separable algebraic extensions, then the maximal perfect subfield of K is the
maximal perfect subfield of L.

COROLLARY 2. If the elements of a set T are algebraically independent over a
field L, then the intersection of all the fields $L(T^{p^e})$, for e = 1, 2, ..., is exactly
the field L.


















9. Pure forms. The notion of a p-basis of a field can be used to show that
Albert's results on pure null forms over a function field are in essence valid
over an arbitrary coefficient field. We remark first that the degree of
imperfection of a field K has been defined (Teichmüller [10]) to be the number of
elements in a p-basis of K. A pure form f of degree q over K,

$f(x_1, \dots, x_m) = b_1 x_1^q + \dots + b_m x_m^q$ ($b_i$ in K),

is a null form over K if $f(x_{10}, \dots, x_{m0}) = 0$ for values $x_{10}, \dots, x_{m0}$ not all
zero in K.

THEOREM 20. Every pure form $f(x_1, \dots, x_m)$ of degree $q = p^e$ over a field K
of characteristic p is a null form if the number of variables m exceeds $q^r$, where r
is the degree of imperfection of K. However, there exist non-null forms over K
for every $m \le q^r$.

The proof is exactly similar to that given by Albert ([2], Theorem 6).
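
The two smallest cases of the bound can be worked by hand. This is an illustrative sketch, reading the bound in Theorem 20 as $q^r$ with r the degree of imperfection; the field $\mathbb{F}_p(t)$ and the particular forms are chosen here for illustration and do not come from the text.

```latex
% Case r = 0 (K perfect): every coefficient is a q-th power, b_i = c_i^q.
% For m = 2 > q^0 = 1 and b_1, b_2 \neq 0,
b_1 x_1^q + b_2 x_2^q = (c_1 x_1 + c_2 x_2)^q ,
\qquad \text{which vanishes at } (x_1, x_2) = (c_2, -c_1) \neq (0, 0).

% Case r = 1: K = \mathbb{F}_p(t), q = p, m = p = q^1. The form
f(x_1, \dots, x_p) = x_1^p + t\,x_2^p + \dots + t^{p-1} x_p^p
% is not a null form, because 1, t, \dots, t^{p-1} are linearly
% independent over K^p = \mathbb{F}_p(t^p).
```

In both cases the question reduces to linear dependence of the coefficients over $K^q$, a vector space over which K has dimension $q^r$.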
We also prove here a property of p-bases which we have used elsewhere
without proof ([6], Lemma 3).

THEOREM 21. If X and Y are disjoint subsets of a field K such that $X \cup Y$
is p-independent in K, and if $K(X^{p^{-\infty}})$ is obtained from K by adjoining all roots
$x^{p^{-e}}$, for x in X and e an integer, then the set Y is p-independent in $K(X^{p^{-\infty}})$.
Furthermore, if $X \cup Y$ is a p-basis of K, then Y is a p-basis of $K(X^{p^{-\infty}})$.

Proof. If Y is not p-independent in $K(X^{p^{-\infty}})$, there is an element y in Y
p-dependent on a finite subset $Y_0$ of Y not containing y. Hence for some
integer e and some finite subset $X_0 \subset X$, $y \in K^p(X_0^{p^{-e}}, Y_0)$, so $y^{p^e} \in K^{p^{e+1}}(X_0, Y_0^{p^e})$.
This will lead to a contradiction on the degree of the field $L = K^{p^{e+1}}(X_0, Y_0, y)$
over $K^{p^{e+1}}$. For, on the one hand, the p-independence of $X \cup Y$ in K means
that $X_0$, $Y_0$, and y together have degree $p^{\xi+\eta+1}$ over $K^p$, where $\xi$ and $\eta$
respectively denote the number of elements in $X_0$ and $Y_0$. Hence by an induction
one finds

$[L : K^{p^{e+1}}] = [K^{p^{e+1}}(X_0, Y_0, y) : K^{p^{e+1}}] = p^{(\xi+\eta+1)(e+1)}$.

On the other hand, $y^{p^e} \in K^{p^{e+1}}(X_0, Y_0)$, so that if we adjoin first all the elements
of $X_0$ and $Y_0$, then finally the element y, we get for the same extension a degree
not more than $p^{(\xi+\eta)(e+1)+e}$, a contradiction. This establishes the p-independence
of Y in $K(X^{p^{-\infty}})$; that it becomes a p-basis is readily verified from the definition.


10. Fields without separating transcendence bases. We now give a counter- 
example to the possible extension of Theorem 4 of the introduction; in that we 
show: 

(i) There is a field M which does not have a separating transcendence basis
over its maximal perfect subfield P, but which does have a finite transcendence
degree t over that field P. Here t may be any specified integer $t \ge 2$.

This example will also show: 

(ii) The number of elements in a p-basis of a field K (the so-called degree of 
imperfection of K) is not always equal to the transcendence degree of K over its 
maximal perfect subfield. 

Teichmüller, in [10], has also proved (ii) by the example in which K is a field
of formal power series, where the transcendence degree in question is infinite. 
Our example yields a field K in which both the transcendence degree and the 
degree of imperfection are finite. 

Let P be a perfect field and $Z = \{z_1, z_2, \dots\}$ a denumerable set of quantities
algebraically independent over P. Denote by $P(Z^{p^{-\infty}})$ the perfect field

(1) $P(Z^{p^{-\infty}}) = P(Z, Z^{p^{-1}}, Z^{p^{-2}}, \dots)$.

Let y and $u_0$ be algebraically independent over $P(Z^{p^{-\infty}})$, and define quantities
$u_n$ recursively by

(2) $u_n = y^{p^{n-1}} + z_n u_{n-1}$ (n = 1, 2, ...).

The field which we use as an example is then

$M = P(Z^{p^{-\infty}}, y, u_0, u_1^{1/p}, u_2^{1/p^2}, \dots, u_n^{1/p^n}, \dots)$.

By (2), $u_{n-1}^{p^{-(n-1)}}$ can be expressed in terms of $u_n^{p^{-n}}$, so M is the union of a tower
of fields $M_0 \subset M_1 \subset M_2 \subset \dots$, where

$M_n = P(Z^{p^{-\infty}}, y, u_n^{p^{-n}})$ (n = 0, 1, 2, ...).

From equations (2) we observe that

(3) $P(z_1, \dots, z_n, y, u_0) = P(z_1, \dots, z_n, y, u_n)$

and hence that $z_1, \dots, z_n, y, u_n$ are algebraically independent over P.
Furthermore

(4) $M_0 = P(Z^{p^{-\infty}}, y, u_n)$, $M_n = M_0(u_n^{p^{-n}})$,

so that $M_n$ is a purely inseparable extension of $M_0$ of degree $p^n$. Therefore
the necessary condition of Theorem 6 applied to the transcendence basis T =
$\{y, u_0\}$ for M proves the following result:

LEMMA 6. The field M does not have a separating transcendence basis over
$P(Z^{p^{-\infty}})$.








The extended field $M(y^{1/p})$ would by the equations (2) also contain each
$u_{n-1}^{p^{-n}}$, so that we can assert

LEMMA 7. The field M has a p-basis consisting of one element y and a
transcendence basis $\{y, u_0\}$ over the perfect subfield $P(Z^{p^{-\infty}})$.

Thus our example has the properties (i) and (ii) stated above, provided we
can prove that $P(Z^{p^{-\infty}})$ is the maximal perfect subfield of M. This we now do.

LEMMA 8. An element b of M is in $M_n$ if and only if $b^{p^n}$ is separable over $M_0$.

Proof. By (4), all elements b of $M_n$ certainly have the property stated.
Conversely, suppose $b^{p^n}$ separable over $M_0$, and take $k \ge n$ so large that $b \in M_k
= M_0(u_k^{p^{-k}})$. In such a field $M_k$, it is known that any element b with $b^{p^n}$
separable over the base field $M_0$ must be$^{22}$ in $M_0(u_k^{p^{-n}})$. If k > n, (2) yields
$u_k^{p^{-n}} = y^{p^{k-n-1}} + z_k^{p^{-n}} u_{k-1}^{p^{-n}}$, hence $M_0(u_k^{p^{-n}}) = M_0(u_{k-1}^{p^{-n}})$. Therefore, by induction,
$b \in M_0(u_n^{p^{-n}}) = M_n$, as asserted.

Consider now any element a in the maximal perfect subfield $M^{p^\infty}$ of M. The
expansion for a in terms of the generators of M can involve but a finite number
of u's, but a finite number of z's, and but a finite number of $p^n$-th roots of z's.
Hence we can find a power $c = a^{p^k}$, also in $M^{p^\infty}$, and an integer m such that

(5) $c \in P(Z_m, y, u_0)$, where $Z_m$ is a finite subset $Z_m = \{z_1, \dots, z_m\}$.

For each e, $c = c_e^{p^e}$ for some element $c_e$ of M, while by Lemma 8, $c_e \in M_e$. Thus

(6) $c \in P(Z_m, y, u_0)$, $c \in M_e^{p^e}$ (e = 0, 1, 2, ...).

This situation will now be simplified by showing that only a finite number
of z's need be used in these fields $M_e$, provided $e \ge m$. Note that $M_e^{p^e} =
P(Z^{p^{-\infty}}, y^{p^e}, u_e)$.

For each $e \ge m$ pick the smallest n such that the first n z's suffice to make
$c \in M_e^{p^e}$: i.e., such that $c \in P(Z_n^{p^{-\infty}}, y^{p^e}, u_e)$. If n > e, c is not in the field R =
$P(Z_{n-1}^{p^{-\infty}}, y^{p^e}, u_e)$, but is in $R(z_n^{p^{-t}})$ for some t. As in the remark following (3),
$z_n$ is transcendental over $P(Z_{n-1}, y, u_e)$, hence over R. Thus c in the purely
transcendental extension $R(z_n^{p^{-t}})$ must itself be transcendental over R. On the
other hand, (6) and (4) state that

$c \in P(Z_m, y, u_0) \subset P(Z_e, y, u_0) = P(Z_e, y, u_e) \subset P(Z_{n-1}, y, u_e)$,

and the last of these fields is algebraic over $R = P(Z_{n-1}^{p^{-\infty}}, y^{p^e}, u_e)$. Therefore
c is algebraic over R, a contradiction from which we conclude that

$c \in P(Z_e^{p^{-\infty}}, y^{p^e}, u_e)$ (e = m, m + 1, ...).

If the expression for c in terms of these generators actually involves some roots
of $Z_e$, pick s > 0 so small that c is in $N^* = P(Z_e^{p^{-s}}, y^{p^e}, u_e)$ but not in N =
$P(Z_e^{p^{-s+1}}, y^{p^e}, u_e)$. The extension $N^*/N$ is given by a tower $N \subset N_1 \subset N_2
\subset \dots \subset N^*$, where

$N_0 = N$, $N_i = N(z_1^{p^{-s}}, \dots, z_i^{p^{-s}})$ (i = 0, ..., e).

For some i, c is in $N_i$ but not in $N_{i-1}$. Since $N^{*p} \subset N$, c is purely
inseparable over $N_{i-1}$, and the two extensions $N_{i-1}(c)$ and $N_i = N_{i-1}(z_i^{p^{-s}})$, each of
degree p, must be equal. Therefore $z_i^{p^{-s}} \in N_{i-1}(c)$. But now set $F = P((Z -
\{z_i\})^{p^{-\infty}}, y, u_e)$, so that $N_{i-1} \subset F(z_i^{p^{-s+1}})$, while by (6), $c \in P(Z_m, y, u_0) \subset
F(z_i^{p^{-s+1}})$. Therefore $z_i^{p^{-s}} \in F(z_i^{p^{-s+1}})$. This conclusion is a contradiction
because $z_i$ is a quantity transcendental over F. Hence we have s = 0, and

(7) $c \in P(Z_e, y^{p^e}, u_e)$ (e = m, m + 1, ...).

Combining (6) with (7), we next aim to prove that c is in each of the fields

(8) $D_{e,m} = P(Z_m, y^{p^e}, u_m^{p^{e-m}})$ (e = m, m + 1, ...).

Note that $D_{e,e}$ is the field $P(Z_e, y^{p^e}, u_e)$ of (7). A preliminary is

$^{22}$ Mac Lane [5], Lemma 6.1: If u is transcendental over F, then the elements b of $F(u^{p^{-k}})$
with $b^{p^n}$ separable over F(u) all lie in $F(u^{p^{-n}})$.


LEMMA 9. Each $D_{e,m}$ with $e \ge m$ is relatively algebraically closed (see footnote
13) in $D_{e,e}$.

Proof. First simplify the notation of (8) thus:

(9) $D_{e,n} = P(Z_n, y', v)$, $y' = y^{p^e}$, $v = u_n^{p^{e-n}}$,
(10) $D_{e,n+1} = P(Z_n, z, y', u)$, $z = z_{n+1}$, $u = u_{n+1}^{p^{e-n-1}}$.

In terms of these quantities u and v the defining equation (2) becomes

(11) $u^p = y' + z^{p^{e-n}} v$.

Hence $D_{e,n} \subset D_{e,n+1}$, according to (9) and (10), and we can go from $D_{e,m}$ to $D_{e,e}$
by a tower

$D_{e,m} \subset D_{e,m+1} \subset D_{e,m+2} \subset \dots \subset D_{e,e}$.

For the lemma it therefore suffices to prove $D_{e,n}$ relatively algebraically closed
in $D_{e,n+1}$ for all n < e.

By (9) and (10) $D_{e,n+1} = D_{e,n}(z, u)$, where, as in (3), z is transcendental over
$D_{e,n}$, while u is inseparable and algebraic over $D_{e,n}(z)$ in accord with (11). If
$D_{e,n}$ is not relatively algebraically closed in $D_{e,n+1}$, pick b in $D_{e,n+1} - D_{e,n}$ and
algebraic over $D_{e,n}$. Then $b^p$ is in $D_{e,n}(z)$ and is algebraic over $D_{e,n}$, so $b^p$ is in
$D_{e,n}$. Therefore b is purely inseparable over $D_{e,n}$ and also over $D_{e,n}(z)$, and we
must have $D_{e,n+1} = D_{e,n}(z, b)$. Therefore u is a rational function g(z)/h(z) of z
with coefficients in $D = D_{e,n}(b)$. We can assume that g(0) and h(0) are not
both 0. This value of u in the defining equation (11) for u gives an identity

(12) $[h(z)]^p (y' + z^{p^{e-n}} v) = [g(z)]^p$

in z over D. By setting z = 0, we find $y' = [g(0)/h(0)]^p$, with neither h(0) nor
g(0) zero. This means that y' is in $D^p$, hence that $(y')^{1/p}$ is in D. A similar
argument on the terms of highest degree in (12) proves that $v^{1/p} \in D$. However,
in $D_{e,n}$ of (9), $Z_n$, y' and v are algebraically independent over P, so that y', v are
p-independent in $D_{e,n}$. This means that the extension $D_{e,n}((y')^{1/p}, v^{1/p})$ has
degree $p^2$ over $D_{e,n}$, although we have just shown this extension to be contained
in D, of degree p over $D_{e,n}$. This contradiction establishes the desired relative
algebraic closure.


LEMMA 10. For $e \ge m$, $c \in D_{e,m}$.


Proof. By (6), $c \in P(Z_m, y, u_0) = P(Z_m, y, u_m)$, a field which is certainly
algebraic over $D_{e,m}$ of (8). On the other hand $c \in D_{e,e} = P(Z_e, y^{p^e}, u_e)$, by (7),
and this field by the previous lemma contains no elements algebraic over $D_{e,m}$
except the elements of $D_{e,m}$ themselves. Hence we get the conclusion.$^{23}$

If we put $L = P(Z_m)$, $\mu = e - m$, Lemma 10 states that c is in each of the
fields $D_{e,m} = L((y^{p^m})^{p^\mu}, u_m^{p^\mu})$. Here $y^{p^m}$ and $u_m$ are algebraically independent
over L, so the intersection of all these fields, for $\mu = 1, 2, \dots$, is known by
Corollary 2 to Theorem 19 of §8 to be L itself. Therefore $c \in P(Z_m)$, and the
original element $a = c^{p^{-k}}$ of the maximal perfect subfield is therefore in $P(Z^{p^{-\infty}})$.
We have thus completed our example by proving


LEMMA 11. The field M has the maximal perfect subfield $P(Z^{p^{-\infty}})$.


This field M thus has a transcendence degree 2 over its maximal perfect
subfield. A field with analogous properties but with any desired transcendence
degree $t \ge 2$ over its maximal perfect subfield, is, by Corollary 1 of Theorem 19,
the field $M^* = M(T)$, where T is a set of t − 2 elements algebraically
independent over M.


BIBLIOGRAPHY


1. A. A. ALBERT, Modern Higher Algebra, Chicago, 1937.
2. A. A. ALBERT, Quadratic null forms over a function field, Annals of Mathematics, vol.
39(1938), pp. 494-505.
3. M. BECKER AND S. MAC LANE, The minimum number of generators for inseparable
algebraic extensions. To be published.
4. H. HASSE AND F. K. SCHMIDT, Die Struktur diskret bewerteter Körper, Journal für die
Mathematik, vol. 170(1934), pp. 4-63.
5. S. MAC LANE, Steinitz field towers for modular fields. Forthcoming in the Transactions
of the American Mathematical Society.
6. S. MAC LANE, Subfields and automorphism groups of p-adic fields, Annals of
Mathematics, vol. 40(1939), pp. 423-442.
7. S. MAC LANE, A lattice formulation for transcendence degrees and p-bases, this Journal,
vol. 4(1938), pp. 455-468.
8. F. K. SCHMIDT, Allgemeine Körper im Gebiet der höheren Kongruenzen, Dissertation,
Erlangen, 1925.
9. E. STEINITZ, Algebraische Theorie der Körper, edited by R. Baer and H. Hasse, Berlin,
1930.


$^{23}$ The arguments of Lemmas 8 and 9 are the crux of this example. As given, they
depend essentially upon the algebraic independence of the z's. This is the inner reason
for the complicated structure of the maximal perfect field $P(Z^{p^{-\infty}})$ used for this example.










10. O. TEICHMÜLLER, p-Algebren, Deutsche Mathematik, vol. 1(1936), pp. 362-388.
11. O. TEICHMÜLLER, Diskret bewertete perfekte Körper mit unvollkommenem
Restklassenkörper, Journal für die Mathematik, vol. 176(1937), pp. 126-140.
12. B. L. VAN DER WAERDEN, Moderne Algebra, vol. I, first edition, Berlin, 1930.
13. B. L. VAN DER WAERDEN, Zur algebraischen Geometrie XIV, Schnittpunktszahlen von
algebraischen Mannigfaltigkeiten, Mathematische Annalen, vol. 115(1938), pp. 619-644.


HARVARD UNIVERSITY. 





OSCILLATING FUNCTIONS 
By R. P. Boas, Jr. 


1. Introduction. We may say that a function f(x) is monotonic at the point
$x_0$ if there is a positive $\delta$ such that, whenever $x_0 - \delta < x_1 \le x_0 \le x_2 < x_0 + \delta$,
either $f(x_1) \le f(x_0) \le f(x_2)$ or $f(x_1) \ge f(x_0) \ge f(x_2)$; a function monotonic at $x_0$
is not necessarily monotonic in any interval containing $x_0$. There are then
several senses in which a continuous function f(x) may be said to be
everywhere oscillating: f(x) may be monotonic in no interval, almost nowhere
monotonic (i.e., monotonic at most at the points of a set of measure zero), monotonic
at most at the points of a countable set, or monotonic at no point. The most
natural questions of the existence of functions monotonic in no interval,
belonging to more or less restricted classes, are settled by the functions
constructed by P. Köpcke and A. Denjoy,$^1$ which are monotonic in no interval,
and not only absolutely continuous, but differentiable at every point, with
bounded derivatives. P. Hartman and R. Kershner$^2$ have recently given a
simple construction of an absolutely continuous function which is monotonic
in no interval.

It is evident that an absolutely continuous function cannot be almost nowhere
monotonic, since it is surely monotonic at the points of the set where its
derivative is not zero. Similarly, it is clear that a function which almost
everywhere fails to have a finite derivative is almost nowhere monotonic, since by a
well known theorem,$^3$ such a function will almost everywhere have one of its
upper Dini derivatives $+\infty$, and one of its lower Dini derivatives $-\infty$. These
considerations tell us nothing about the existence of a continuous function of
bounded variation, almost nowhere monotonic; in this note such a function
will be constructed. A continuous function f(x) of bounded variation must,
however, be monotonic at the points of an uncountable set.$^4$ For, let the
curve y = f(x) ($0 \le x \le 1$) have the parametric representation x = x(s),
y = y(s) ($0 \le s \le l$, $l > 1$), where s is the arc length. Then$^5$ $x'(s)^2 + y'(s)^2 = 1$





Received January 16, 1939. The author is a National Research Fellow. 

$^1$ A. Denjoy, Sur les fonctions dérivées sommables, Bulletin de la Société Mathématique
de France, vol. 43(1915), pp. 161-248; pp. 210 ff. Denjoy gives a critique of Köpcke's
construction (pp. 228 ff.).

$^2$ P. Hartman and R. Kershner, The structure of monotone functions, American Journal
of Mathematics, vol. 59(1937), pp. 809-822; p. 817.

$^3$ E. W. Hobson, The Theory of Functions of a Real Variable and the Theory of Fourier's
Series, vol. 1, 1927, p. 400.

$^4$ I am indebted to A. P. Morse for this remark.

$^5$ See, e.g., F. Riesz, Sur l'existence de la dérivée des fonctions monotones et sur quelques
problèmes qui s'y rattachent, Acta Litterarum ac Scientiarum Regiae Universitatis
Hungaricae Francisco-Josephinae, Sectio Scientiarum Mathematicarum [Szeged], vol.
5(1930-32), pp. 208-221; p. 216.




for almost all s. Moreover, $y'(s) \ne 0$ for a set of values s of positive measure,
since otherwise we should have $x'(s) = 1$ for almost all s, and therefore
$l = \int_0^l x'(s)\,ds \le x(l) - x(0) = 1$. Hence, for a set of values of s of positive
measure, and consequently for an uncountable set of values of x, $f'(x) = y'(s)/x'(s)
\ne 0$; when $f'(x_0) \ne 0$, f(x) is evidently monotonic at $x_0$.

It is reasonable to suppose that "most" functions are everywhere oscillating.
We shall make this statement precise in terms of the category of various classes
of everywhere oscillating functions in various spaces. Again, several
statements are trivial consequences of known results. Thus from a theorem of J. C.
Oxtoby$^6$ it follows immediately that functions monotonic in no interval form
a residual $G_\delta$ set$^7$ in the space AC of absolutely continuous functions. Since
nowhere-differentiable functions form a residual set in the spaces $H^\alpha$ of functions
satisfying Lipschitz conditions of order $\alpha$ ($0 < \alpha < 1$),$^8$ the nowhere monotonic
functions of these spaces also form residual sets.

We shall show that in the space CBV of functions of bounded variation the
functions monotonic in no interval form a residual $G_\delta$ set; the method could
easily be adapted to establish the same result for the spaces AC and $H^\alpha$ as well.
A simpler, but closely related, result is that the set of functions of CBV,
monotonic at a specified point, is of the first category.

Let R be the space of integrals of essentially bounded functions on (0, 1),
vanishing at the origin, with the natural norm$^9$ $\|f\| = \sup^\circ |f'(x)|$. It is
clear that the set O of elements of R, monotonic in no interval, is not residual;$^{10}$
neither is its complement (once we know that O is not empty). We can,
however, by using a different metric, establish by a simple category argument the
existence of a function, monotonic in no interval, with a bounded derivative;
regarded as a construction, this of course establishes less than the constructions
of Köpcke and Denjoy.


2. The oscillating functions of CBV. The elements of CBV are continuous
functions x(t) of bounded variation on (0, 1), normalized by the condition
x(0) = 0, and with norm $\|x\| = \int_0^1 |dx(t)|$. With the obvious definitions
of the operations in the space, CBV is a Banach space. Let O be the set of
elements of CBV, monotonic in no subinterval of (0, 1).

$^6$ J. C. Oxtoby, The category and Borel class of certain subsets of $\mathfrak{L}_p$, Bulletin of the
American Mathematical Society, vol. 43(1937), pp. 245-248; Theorem 5.

$^7$ A set of the first category is the sum of a countable number of nowhere dense sets.
A residual set is the complement of a set of the first category. A $G_\delta$ set is the intersection
of a countable number of open sets; an $F_\sigma$ is the complement of a $G_\delta$.

$^8$ H. Auerbach and S. Banach, Über die Höldersche Bedingung, Studia Mathematica,
vol. 3(1931), pp. 180-188.

$^9$ The superscript ° attached to a symbol indicates the disregard of certain exceptional
sets of measure zero. Thus sup° means "essential least upper bound" (= "true max",
"ess. sup.", etc.); =° means "equals almost everywhere"; etc. (Notation suggested by
F. Smithies.)

$^{10}$ It is not even everywhere dense.

THEOREM 1. The set O is a residual $G_\delta$ set in CBV.

Let C(O) be the complement of O. Let $\{I_n\}$ be the sequence of subintervals
of (0, 1) with rational endpoints, and denote by $E_n$ the set of elements of CBV
which are monotonic in $I_n$.$^{11}$ Evidently each $E_n$ is closed (since if $x_n \to x$ in
CBV, $x_n(t) \to x(t)$ uniformly in $0 \le t \le 1$). Moreover, $C(O) = \sum_{n=1}^{\infty} E_n$. To
establish Theorem 1, then, we have to show that each $E_n$ is nowhere dense.

Since each $E_n$ is closed, it is enough to show that for any $x \in$ CBV, and $\eta$
($0 < \eta < 1$), we can construct $y \in C(E_n)$ such that $\|y - x\| < \eta$; for, this
shows that $C(E_n)$ is everywhere dense, and a closed set with a dense complement
is nowhere dense.

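We introduce an auxiliary function $z(t; \tau, \gamma, \beta)$ defined for $0 \le t \le 1$,
$0 \le \tau < \tau + \gamma \le 1$, $\beta > 0$ to be continuous in (0, 1); zero in $(0, \tau)$ and
$(\tau + \gamma, 1)$; $\beta$ at $t = \tau + \frac{1}{2}\gamma$; and linear in $(\tau, \tau + \frac{1}{2}\gamma)$ and in
$(\tau + \frac{1}{2}\gamma, \tau + \gamma)$. Then $z(t; \tau, \gamma, \beta) \in$ CBV and $\|z\| = 2\beta$.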
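
The auxiliary function is a "tent" of height β over an interval of length γ, and its norm is the rise plus the fall. The following sketch checks this numerically; the names `tent` and `total_variation` and the sample parameters are illustrative assumptions, not notation from the paper.

```python
def tent(t, tau, gamma, beta):
    """Auxiliary function z(t; tau, gamma, beta): zero outside (tau, tau + gamma),
    equal to beta at the midpoint, and linear on each half."""
    half = gamma / 2.0
    if t <= tau or t >= tau + gamma:
        return 0.0
    if t <= tau + half:
        return beta * (t - tau) / half
    return beta * (tau + gamma - t) / half


def total_variation(f, n=4000):
    """Approximate the total variation of f on [0, 1] over a uniform grid.
    For a piecewise linear f whose corners lie on the grid this is exact
    up to rounding."""
    values = [f(k / n) for k in range(n + 1)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


# Hypothetical sample parameters; the corners 0.25, 0.5, 0.75 lie on the grid.
tau, gamma, beta = 0.25, 0.5, 0.3
variation = total_variation(lambda t: tent(t, tau, gamma, beta))
print(variation)  # approximately 2 * beta
```

The grid is chosen so that the three corners of the tent are grid points, which is why the discrete sum reproduces the norm 2β rather than merely approximating it from below.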

If x(t) is not monotonic in $I_n$, we take y(t) = x(t).

If x(t) is constant in $I_n$, set $y(t) = x(t) + z(t; \lambda, \delta, \delta)$, where $\lambda$ is the midpoint
of $I_n$, $\delta < \frac{1}{2}\eta$, and $2\delta$ is less than the length of $I_n$. Then $\|y - x\| = 2\delta < \eta$,
and y(t) is not monotonic in $I_n$.

If x(t) is monotonic and not constant in $I_n = (t', t'')$, there is a point $s \ge t'$
such that x(t) = x(t') in (t', s), but $x(t) \ne x(t')$ for $s < t \le t''$. We take a number
A > 0 and a point $\tau$ such that $s < \tau < t''$ and $|x'(\tau)| < A$. We set $y(t) =
x(t) \mp z(t; \tau, 2\delta, 3A\delta)$, where $6A\delta < \eta$, $\tau + 2\delta \le 1$, and the upper or lower sign
is taken according as x(t) is increasing or decreasing$^{12}$ in $I_n$.

Then $\|y - x\| = 6A\delta < \eta$. Suppose, for definiteness, that x(t) increases
in $I_n$. Then $x'(\tau) < A$, and hence, for sufficiently small positive $\zeta$, $x(\tau + \zeta)
- x(\tau) < 2A\zeta$. For $0 < \zeta < \delta$, $z(\tau + \zeta; \tau, 2\delta, 3A\delta) - z(\tau; \tau, 2\delta, 3A\delta) = 3A\zeta$.
Therefore $y(\tau + \zeta) - y(\tau) < 2A\zeta - 3A\zeta < 0$. On the other hand, y(t) = x(t)
in $(s, \tau)$, and x(t) increases there and is not constant. Hence y(t) is not
monotonic in $I_n$. Similar reasoning applies if x(t) decreases in $I_n$; this completes
the proof of Theorem 1.
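
The perturbation step of the proof can be tried on a concrete increasing function. The sketch below assumes x(t) = t, the bound A = 2, and the sample values of τ and δ; all of these, and the helper name `tent`, are illustrative choices rather than data from the paper.

```python
def tent(t, tau, gamma, beta):
    """Tent function: zero outside (tau, tau + gamma), height beta at the
    midpoint, linear on each half."""
    half = gamma / 2.0
    if t <= tau or t >= tau + gamma:
        return 0.0
    if t <= tau + half:
        return beta * (t - tau) / half
    return beta * (tau + gamma - t) / half


A = 2.0                 # any bound with x'(tau) < A; here x'(t) = 1
tau, delta = 0.5, 0.05  # hypothetical sample point and tent half-width


def x(t):
    """An increasing element of CBV used as the function to perturb."""
    return t


def y(t):
    """x is increasing, so the tent of width 2*delta and height 3*A*delta
    is subtracted (the upper sign in the proof)."""
    return x(t) - tent(t, tau, 2 * delta, 3 * A * delta)


zeta = delta / 2
drop = y(tau + zeta) - y(tau)   # equals zeta - 3*A*zeta, which is negative
rise = y(tau) - y(tau - 0.1)    # y = x to the left of tau, so this is positive
print(drop, rise)
```

A strict rise just before τ followed by a strict drop just after it is what rules out monotonicity of y on any interval containing τ, at a cost of 6Aδ in norm.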




3. Functions monotonic at a particular point. Let us say that x(t) increases
on the right at $t_0$ if there is a positive $\delta$ such that $x(t_0) \le x(t)$ when $t_0 \le t \le
t_0 + \delta$. We have the following theorem.

THEOREM 2. For any $t_0$, $0 < t_0 < 1$, the sets $I(t_0)$, $D(t_0)$ of elements of CBV
which, respectively, increase or decrease on the right at $t_0$ are $F_\sigma$ sets of first category.

$^{11}$ This choice of sets $E_n$ was suggested by the referee, and leads to a considerable
simplification of my original proof.

$^{12}$ By "increasing" we mean, throughout, "increasing in the weak sense"
("non-decreasing"). Similarly for "decreasing".













Theorem 2 is also true if CBV is replaced by AC or $H^\alpha$ $(0 < \alpha < 1)$; the proof
requires only slight modifications.
Theorem 2 leads to a simplified proof of the category result of Theorem 1.

Let $\{t_n\}$ $(n = 1, 2, \cdots)$ be a countable set of points dense in $(0, 1)$. Let
$H_1, H_2, \cdots$ denote the intervals of the countable set $(k/m, (k + 1)/m)$
$(m = 1, 2, \cdots;\ k = 0, 1, \cdots, m - 1)$. If $x \in C(O)$, $x$ is monotonic in some
$H_k$, and consequently increasing or decreasing on the right at every point $t_n$
in this $H_k$. We therefore have

$C(O) \subset \sum_{k=1}^{\infty} \prod_{t_n \in H_k} \{ I(t_n) + D(t_n) \}$,

so that $C(O)$ is a set of first category.
We now prove Theorem 2 for $I(t_0)$. We have $I(t_0) = \sum_{n=1}^{\infty} E_n$, where $E_n$
is the set of $x$ such that $x(t) - x(t_0) \ge 0$ for $t_0 \le t \le t_0 + n^{-1}$. Evidently each
$E_n$ is closed, and Theorem 2 follows if we show that the complement of each $E_n$
is everywhere dense; for then each $E_n$ is nowhere dense.

Let $x$ be any element of CBV, and let $\eta > 0$ be arbitrary. We construct,
for each $n$, $y \in C(E_n)$ with $\| y - x \| \le \eta$. To do this, choose $\epsilon > 0$ so that
$2\epsilon < n^{-1}$ and $x(t) \le x(t_0) + \tfrac14\eta$ for $t_0 \le t \le t_0 + 2\epsilon$. Then if $y(t) = x(t) -
z(t; t_0, 2\epsilon, \tfrac12\eta)$, $\| y - x \| = \eta$; and $y(t_0 + \epsilon) = x(t_0 + \epsilon) - \tfrac12\eta < x(t_0) = y(t_0)$,
so that $y \in C(E_n)$.


4. Oscillating functions with bounded derivatives. Let $S$ be the space of
measurable functions $x(t)$ $(0 \le t \le 1)$ such that $\sup^{0} |x(t)| \le 1$; $S$ is a complete
metric space under the "L metric" $(x, y) = \int_0^1 |x(t) - y(t)|\,dt$. Now let $T$
be the complete metric space whose elements are absolutely continuous functions
$x(t)$ with $x(0) = 0$ and $x'(t) \in S$, and with the distance between $x$ and $y$ defined
as $(x', y')$. (Note that the completeness of $T$ is ensured by the uniform essential
boundedness of the $x'(t)$.) Then there is an obvious isometric relation between
$T$ and $S$ in which the set $O$ of functions of $T$, monotonic in no interval, corresponds
to the set of functions of $S$ which are not of essentially fixed sign in any
interval.

We shall prove 

THEOREM 3. The set $O$ is a residual $G_\delta$ set in $T$ (and consequently not empty$^{13}$).

Let $\{I_n\}$ be the sequence of "rational intervals", as in the proof of Theorem 1,
and let $E_n$ be the set of $x \in S$ such that, in $I_n$, $x(t) \ge^{0} 0$ or $x(t) \le^{0} 0$. Then
each $E_n$ is closed.
Consider any $x \in E_n$, and any $\eta > 0$. If $x(t) =^{0} 0$ in $I_n$, set $y(t) = x(t)$ if
$t$ is not in $I_n$; and $y(t) = \eta/d$ in half of $I_n$, $y(t) = -\eta/d$ in the other half, where
$d$ is the length of $I_n$; then $y \in C(E_n)$ and $\| y - x \| = \eta$.

13 By Baire's theorem: A complete metric space is not of the first category. (See, e.g.,
C. Kuratowski, Topologie I, 1933, pp. 204-205.)

If $x(t) >^{0} 0$ or $x(t) <^{0} 0$ in $I_n$, take a set $J \subset I_n$, of measure $\epsilon$ $(0 < \epsilon < \tfrac12\eta)$,
such that $|x(t)| >^{0} 0$ in $J$ and in $I_n - J$; and define $y(t) = x(t)$ if $t \in (0, 1) - J$,
$y(t) = -x(t)$ if $t \in J$. Then $y \in C(E_n)$; and $\| y - x \| \le 2\epsilon < \eta$.

Thus $E_n$ is closed and contained in the closure of its complement, and is consequently
nowhere dense. Hence $\sum_{n=1}^{\infty} E_n$ is an $F_\sigma$ set of first category, and Theorem 3 follows.


5. An oscillating function of CBV. Our aim in this section is to construct 
an almost nowhere monotonic function of CBV; the function which we shall 
construct will actually have the property that, except at the points of a set of 
measure zero, it is monotonic neither on the right nor on the left. 

We denote by $|E|$ the measure of the set $E$. Let $h_a(x)$ be the function
which increases on $(0, 1)$ from $0$ to $a > 0$, and which is constant on each complementary
interval of the Cantor ternary set (of measure zero). Let $g_a(x)$
denote the function defined as

$$g_a(x) = \begin{cases}
h_a(x), & 0 \le x \le 1, \\
h_a(2 - x), & 1 \le x \le 2, \\
-h_a(x - 2), & 2 \le x \le 3, \\
-h_a(4 - x), & 3 \le x \le 4, \\
h_a(x - 4), & 4 \le x \le 5, \\
h_a(6 - x), & 5 \le x \le 6.
\end{cases}$$

Thus $g_a(x)$ increases from $0$ to $a$ on $(0, 1)$, decreases to $-a$ on $(1, 3)$, increases
to $a$ on $(3, 5)$, and decreases to $0$ on $(5, 6)$. We shall construct our function
$f(x)$ (on $(0, 6)$ instead of on $(0, 1)$) by iterating a process of inserting suitably
reduced copies of functions $g_a(x)$ in the intervals in which $g_1(x)$ is constant.
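
The building blocks $h_a$ and $g_a$ can be explored numerically. The following is a minimal Python sketch, assuming that $h_a$ is $a$ times the standard Cantor ternary function (one function that fits the description above). The names `cantor`, `h`, `g` and the digit cutoff `depth` are illustrative choices and do not come from the paper.

```python
def cantor(x, depth=60):
    """Approximate the Cantor ternary function on [0, 1] from ternary digits."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = min(int(x), 2)      # next ternary digit of x
        x -= digit
        if digit == 1:              # x lies in a removed middle third: constant there
            return value + scale
        if digit == 2:
            value += scale
        scale *= 0.5
    return value


def h(x, a=1.0):
    """h_a: increases from 0 to a on (0, 1), constant off the Cantor set (assumed form)."""
    return a * cantor(x)


def g(x, a=1.0):
    """g_a on [0, 6], following the six-piece definition in the text."""
    if x <= 1:
        return h(x, a)
    if x <= 2:
        return h(2 - x, a)
    if x <= 3:
        return -h(x - 2, a)
    if x <= 4:
        return -h(4 - x, a)
    if x <= 5:
        return h(x - 4, a)
    return h(6 - x, a)
```

Evaluating `g` at the integers 0, 1, ..., 6 reproduces the pattern described in the text: up to $a$, down to $-a$, up to $a$, and back to 0.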

Let $\epsilon_n > 0$, $\epsilon_1 = 1$, $\sum_{n=1}^{\infty} \epsilon_n < \infty$. We proceed to define a sequence of functions
$f_n(x)$.

We set $f_0(x) = 0$, and $f_1(x) = g_1(x)$. Then $f_1(x)$ is constant in each (open)
interval of a countable set of intervals $I(1, k)$ $(k = 1, 2, \cdots)$; the closed set $E_1$
complementary to the sum of the $I(1, k)$ has measure zero. We now define
$f_n(x)$ by induction.
Suppose that $f_m(x)$ has been constructed for $m \le n - 1$ $(n \ge 2)$; that $f_m(x)$
is constant in each interval of a set $\{I(m, k)\}$, where the closed set $E_m$ complementary
to $\sum_{k=1}^{\infty} I(m, k)$ has measure zero; and that each $I(m, k)$ is contained
in some $I(m - 1, j)$. Suppose further that $f_m(x) > f_{m-1}(x)$ in the first and third
thirds of each $I(m - 1, k)$, while $f_m(x) < f_{m-1}(x)$ in the middle third of each
$I(m - 1, k)$.

We set $f_n(x) = f_{n-1}(x)$ for $x \in E_{n-1}$. Let $I(n - 1, k) = (a_k, b_k)$, and let
$f_{n-1}(x) = \gamma_k$ in $I(n - 1, k)$. We enumerate the intervals $I(n - 1, k)$ by the
rules that of two intervals of unequal length, the longer precedes; and of two
intervals of equal length, the left-hand one precedes. Suppose that

$|I(n - 1, r - 1)| > |I(n - 1, r)| = |I(n - 1, r + 1)| = \cdots = |I(n - 1, s)| = \delta > |I(n - 1, s + 1)|$.

For $r \le k \le s$, we set

$f_n(x) = g_a\!\left(\dfrac{6(x - a_k)}{\delta}\right) + \gamma_k$, $x \in I(n - 1, k)$,

where $a < \eta_k$ $(k = r, r + 1, \cdots, s)$; $\sum_{k=1}^{\infty} \eta_k < \epsilon_n$; and $a$ is so small that if
$I(n - 1, k)$ is in the first or third third of the $I(n - 2, j)$ which contains it,
the minimum in $I(n - 1, k)$ of $f_n(x)$ is greater than the (constant) value of
$f_{n-2}(x)$ in $I(n - 2, j)$, while if $I(n - 1, k)$ is in the middle third of $I(n - 2, j)$,
the maximum in $I(n - 1, k)$ of $f_n(x)$ is less than the value of $f_{n-2}(x)$ in $I(n - 2, j)$.

Then $f_n(x)$ is constant in each interval $I(n, k)$ of a set of total measure one;
$f_n(x) > f_{n-1}(x)$ in the first and third thirds of each $I(n - 1, k)$, while $f_n(x)
< f_{n-1}(x)$ in the middle third; and each $I(n, k)$ is inside some $I(n - 1, j)$.

We now observe some properties of the $f_n(x)$, easily established by induction.
Each $f_n(x)$ is continuous. The total variation on $(0, 6)$ of $f_n(x) - f_{n-1}(x)$ is not
greater than $6\epsilon_n$. In two intervals $I(n - 1, k)$ of equal length, the graphs of
$f_n(x) - f_{n-1}(x)$ are congruent. Moreover, in an $I(n - 1, k)$ the graph of $f_{n-1}(x)$
is a horizontal straight line, $L$; the graphs of $f_n(x)$ and $f_{n+1}(x)$ are above $L$ in the
first and third thirds of $I(n - 1, k)$, and below $L$ in the middle third.

Now the sequence $\{f_n(x)\}$ is evidently uniformly convergent; and since the
total variation of $f_n(x)$ is less than $6\sum_{n=1}^{\infty} \epsilon_n$, the limit, $f(x)$, is an element of CBV.
Let $E = \sum_{n=1}^{\infty} E_n$; then $|E| = 0$. If $t \in C(E)$, there is a unique set of intervals
$I(1, k_1) \supset I(2, k_2) \supset \cdots$, such that $I(n, k_n) \to t$. Except for a set $F$ of measure
zero, the points $t \in C(E)$ have the property that if $I(1, k_1) \supset I(2, k_2) \supset \cdots \to t$,
the sequences of indices $k_n$ for which $t$ is in the first, second, or third third of
$I(n, k_n)$ are all infinite. In fact, $F = \sum_{m=1}^{\infty} (F_m^1 + F_m^2 + F_m^3)$, where $F_m^\sigma$ is the set
of points $t$ such that for $n \ge m$, $t$ is not in the $\sigma$-th third of $I(n, k_n)$. Consider,
say, an $F_m^1$. It is contained in the set of all right-hand two-thirds of intervals
$I(m, k)$, i.e., in a set of measure $\tfrac23 \cdot 6$. Also, every point of $F_m^1$ is in a right-hand
two-thirds of an interval $I(m + 1, j)$ which is in turn contained in a right-hand
two-thirds of an $I(m, k)$; hence $F_m^1$ is contained in a set of measure $\tfrac23 \cdot \tfrac23 \cdot 6$. Continuing
in this way, we see that $|F_m^1| = 0$; and similarly that $|F_m^2| = |F_m^3| = 0$.













Let $t$ be an arbitrary point of $G = C(E) - F$; then $|G| = 1$. We shall show
that $f(x)$ is not monotonic on the right at $t$; a similar proof would show that $f(x)$
is not monotonic on the left at $t$.

We have $t = \lim I(n, k_n)$. Let $I(n, k_n) = J_n + K_n + L_n$ be a decomposition
of $I(n, k_n)$ into successive thirds. Let $\epsilon > 0$ be arbitrary, and let $n_0$ be so large
that $2|I(n, k_n)| < \epsilon$ for $n \ge n_0$. For an infinite number of values $n \ge n_0$,
$I(n + 1, k_{n+1}) \subset J_n$; choose such an $n$, and fix it. The function $f_{n+1}(x)$ is constant
in various subintervals of $K_n$, and at least one of these subintervals is
of the same length as $I(n + 1, k_{n+1})$; choose from these subintervals one in
which the maximum of $f_{n+1}(x)$ is smallest, and call it $H_{n+1}$. Our definition of
$f_{n+1}(x)$ was such that

$\inf_{x \in J_n} f_{n+1}(x) - \sup_{x \in H_{n+1}} f_{n+1}(x) = \theta > 0$.

Now consider $I(n + 2, k_{n+2}) \subset I(n + 1, k_{n+1}) \subset J_n$. $H_{n+1}$ contains some
interval $I(n + 2, j)$ of length $|I(n + 2, k_{n+2})|$; call this interval $H_{n+2}$. Since
$f_{n+2}(x)$ is obtained from $f_{n+1}(x)$ by adding congruent functions in all intervals
$I(n + 2, j)$ of the same length, we have

$\inf_{x \in I(n+2, k_{n+2})} f_{n+2}(x) - \sup_{x \in H_{n+2}} f_{n+2}(x) \ge \theta$.

We continue in the same way, choosing $H_{n+p}$ $(p = 1, 2, \cdots)$ as a subinterval
of $H_{n+p-1}$, so that $|H_{n+p}| = |I(n + p, k_{n+p})|$. The $H_{n+p}$ converge to a point
$t'$, and $0 < t' - t < \epsilon$; we have $f(t) - f(t') \ge \theta > 0$. For any $\epsilon > 0$, we can
find such a $t'$ and $\theta$, and hence $f(x)$ cannot increase on the right at $t$.

We then carry out the same process, starting with a value of $n$ such that
$I(n + 1, k_{n+1}) \subset K_n$, defining $t'$ as a limit of a sequence of intervals $I(n + p,
k_{n+p}) \subset L_n$. In this case we obtain $f(t) < f(t')$, where $0 < t' - t \le \epsilon$, and
consequently $f(x)$ cannot decrease on the right at $t$.


CAMBRIDGE, ENGLAND. 












A DIFFERENTIAL EQUATION FOR ORTHOGONAL POLYNOMIALS 
By J. SHOHAT 


Introduction. Any sequence of orthogonal polynomials$^1$ (OP) $\{\Phi_n(x)\}$ satisfies,
as is known, a linear homogeneous difference equation of second order

(1)  $\Phi_n(x) - (x - c_n)\Phi_{n-1}(x) + \lambda_n\Phi_{n-2}(x) = 0$  $(n \ge 2;\ \Phi_0 = 1,\ \Phi_1 = x - c_1)$,

where $\lambda_n$, $c_n$ are constants, $\lambda_n > 0$. On the other hand, the classical OP of
Hermite, Laguerre and Jacobi (special cases: Legendre's and trigonometrical
polynomials) satisfy, in addition, a homogeneous linear differential equation
of the following type (M, p. 33):

(2)  $A\Phi_n''(x) + B\Phi_n'(x) + C_n\Phi_n(x) = 0$,

where $A$, $B$ are polynomials in $x$, independent of $n$, of degrees not exceeding
2 and 1, respectively, and $C_n$ is a constant depending on $n$.
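
As a concrete check of (1) and (2), the following Python sketch builds the monic Hermite polynomials for the weight $e^{-x^2}$ from the recurrence (1) and verifies the differential equation (2) with exact rational arithmetic. It assumes the standard values for this family, $c_n = 0$, $\lambda_n = (n-1)/2$, $A = 1$, $B = -2x$, $C_n = 2n$; the function names are illustrative only.

```python
from fractions import Fraction


def monic_hermite(n_max):
    """Coefficient lists (lowest power first) of Phi_0..Phi_{n_max} via recurrence (1)."""
    phi = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(2, n_max + 1):
        lam = Fraction(n - 1, 2)                      # lambda_n for the weight exp(-x^2)
        shifted = [Fraction(0)] + phi[n - 1]          # x * Phi_{n-1}
        lower = phi[n - 2] + [Fraction(0)] * 2        # Phi_{n-2}, padded to equal length
        phi.append([s - lam * l for s, l in zip(shifted, lower)])
    return phi[: n_max + 1]


def derivative(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def ode_residual(p, n):
    """Coefficients of A*p'' + B*p' + C_n*p with A = 1, B = -2x, C_n = 2n."""
    d1 = derivative(p)
    d2 = derivative(d1)
    res = [Fraction(0)] * (len(p) + 1)
    for k, c in enumerate(d2):
        res[k] += c              # A * p''
    for k, c in enumerate(d1):
        res[k + 1] -= 2 * c      # B * p' = -2x * p'
    for k, c in enumerate(p):
        res[k] += 2 * n * c      # C_n * p
    return res
```

Calling `ode_residual(p, n)` on each polynomial returned by `monic_hermite` yields a list of zeros, which is the statement of (2) for this family.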

The importance of differential equations in the study of OP needs no further
emphasis. Thus, it is natural to seek other classes of OP for which a
differential equation of this type exists, namely:

(3)  $A_n\Phi_n''(x) + B_n\Phi_n'(x) + C_n\Phi_n(x) = 0$,

where $A_n$, $B_n$, $C_n$ are polynomials in $x$, each of fixed degree independent of $n$,
with coefficients eventually depending on $n$. In a note in the Comptes Rendus$^2$
the author has shown the existence of (3) for a certain general class of OP.
The method employed, following Laguerre, yields rather an "existence proof"
and is not readily applicable to the actual construction of the polynomials
$A_n$, $B_n$, $C_n$.

The object of the present paper is to develop a new simple method for the 
effective construction of the differential equation (3) for an extended class of 
OP, of the same general type as in the note just cited. In application to the 
classical and other OP, this method yields, as by-products, many of their proper- 
ties, old and new. 

The method used is of a very elementary character. 


Received January 23, 1939; presented to the American Mathematical Society, Decem- 
ber 30, 1938. 

1 $\Phi_n(x) = x^n - S_nx^{n-1} + d_{n,n-2}x^{n-2} + \cdots$; $\{\varphi_n(x) \equiv a_n\Phi_n(x)\}$ is the corresponding
normalized sequence. The notations used are those of my monograph: Théorie générale des
polynomes orthogonaux de Tchebichef, Mémorial des Sciences Mathématiques, Fasc. 66, 1934
(hereafter designated by M), to which the reader is referred for further details.

2 Jacques Chokhate, Sur une classe étendue de fractions continues algébriques et sur les
polynomes de Tchebycheff correspondants, Comptes Rendus, vol. 191(1930), pp. 989-990.




1. We consider the sequence of OP

(4)  $\Phi_n \equiv \Phi_n(x) \equiv \Phi_n(x; a, b; p)$,

that is,

$\int_a^b p(x)\Phi_m(x)\Phi_n(x)\,dx = 0$  $(m \ne n;\ m, n = 0, 1, 2, \cdots)$.

Concerning $p(x)$, we make the following assumptions:
(i) $p(x)(x - a)^\sigma$ is finite as $x \to a$ ($a$ finite, $\sigma$ fixed). (Similarly for $x \to b$.)
If $(a, b)$ is infinite, say $b = \infty$, then $\lim_{x \to \infty} x^kp(x) = 0$ $(k = 0, 1, \cdots)$. (Similarly
if $a = -\infty$.)
(ii) $p(x)$ is of the following form:

(5)  $p(x) = \dfrac{1}{A}\exp\left[\int \dfrac{B}{A}\,dx\right]$, i.e., $A' + A\dfrac{p'}{p} = B$.

(Naturally, we assume the existence of all integrals $\int_a^b x^np(x)\,dx$ $(n = 0, 1, \cdots)$.)

Here $A \equiv A(x)$ and $B \equiv B(x)$ are certain fixed polynomials. To simplify our
discussion, we assume that $A(x) > 0$, for $a < x < b$, i.e., $A(x)$ has no zeros
inside $(a, b)$. If $A$ vanishes at $x = a$ ($a$ finite), then it is of course assumed that
$B$ is such that (i) is satisfied. (Similarly for $x = b$.)

From the identity

$\dfrac{1}{A}\exp\left[\int \dfrac{B}{A}\,dx\right] = \dfrac{1}{AQ}\exp\left[\int \dfrac{BQ + AQ'}{AQ}\,dx\right]$

it follows immediately that $A$, $B$ in (5) may be replaced by $A_1 = AQ$, $B_1 =
BQ + AQ'$, where $Q(x)$ is a fixed polynomial of the same nature as $A(x)$. We
thus choose in (5) $A$ so that

(6)  $Ap = 0$ at $x = a, b$,

where $a$, $b$ are finite (by virtue of the first part of condition (i)). (For $(a, b)$
infinite use the second part of the same condition.) Moreover, by virtue of (5),

(7)  $(Ap)' = Bp$.
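
For orientation, relation (7) can be checked directly for the three classical weights. The worked equations below are a sketch that assumes the parametrization $p = x^{\alpha-1}e^{-x}$ for Laguerre and $p = (1+x)^{\alpha-1}(1-x)^{\beta-1}$ for Jacobi; this choice is made here so that the resulting $B$ agrees with the values quoted in (92).

```latex
% Hermite: p = e^{-x^2} on (-\infty, \infty), A = 1, B = -2x
(Ap)' = \bigl(e^{-x^2}\bigr)' = -2x\,e^{-x^2} = Bp .

% Laguerre: p = x^{\alpha-1}e^{-x} on (0, \infty), A = x, B = \alpha - x
(Ap)' = \bigl(x^{\alpha}e^{-x}\bigr)' = (\alpha - x)\,x^{\alpha-1}e^{-x} = Bp .

% Jacobi: p = (1+x)^{\alpha-1}(1-x)^{\beta-1} on (-1, 1), A = 1 - x^2
(Ap)' = \bigl((1+x)^{\alpha}(1-x)^{\beta}\bigr)'
      = \bigl[\alpha(1-x) - \beta(1+x)\bigr](1+x)^{\alpha-1}(1-x)^{\beta-1}
      = \bigl[\alpha - \beta - (\alpha+\beta)x\bigr]\,p = Bp .
```

In each case $Ap$ vanishes at the finite end points when $\alpha, \beta > 0$, which is condition (6).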
In our discussion we shall also deal with the following sequence of OP:

(8)  $u_n \equiv \Phi_n(x; a, b; p_1)$, $p_1 = \Pi(x)p(x)$  $(n = 0, 1, \cdots)$,

where $\Pi(x)$ is a fixed polynomial, of degree $q$, $\Pi(x) > 0$ in $(a, b)$.

Here, by (5), we may take

(9)  $A_1 = A\Pi$, $B_1 = B\Pi + 2A\Pi'$.

A simple application of orthogonality gives (M, p. 26)

(10)  $\Pi u_n = \sum_{i=n}^{n+q} h_i\Phi_i$.







Hereafter, $Q$, $R$ and $h$, with various subscripts, stand respectively for polynomials
or rational functions in $x$ or constants, properly chosen and not necessarily the
same in different formulas; $G_s(x) \equiv \sum_{i=0}^{s} g_ix^i$ stands for an arbitrary polynomial,
of degree $\le s$; this degree will be generally denoted by $\delta G_s$.


We shall make constant use of the following lemma derived at once by a repeated
application of (1).

LEMMA.

$\Phi_\nu(x) = Q_1\Phi_n(x) + Q_2\Phi_{n-1}(x)$  $(\nu \gtrless n)$,

$\sum_{i=\nu}^{\mu} h_i\Phi_i = Q_1\Phi_n + Q_2\Phi_{n-1} = Q_3\Phi_{n-1} + Q_4\Phi_{n-2}$  $(\nu \le n \le \mu)$.

Moreover, in the "symmetric" case (i.e., all $c_n$ in (1) vanish), $Q_1, Q_2, \cdots$ each
contains only even or only odd powers of $x$ (for so does each $\Phi_n$).

It is seen that the coefficients of $Q_1, Q_2, \cdots$ involve the $c$'s and $\lambda$'s
from (1). Note that in the symmetric case (10) becomes, if $\Pi(x) = \Pi(-x)$,

(11)  $\Pi u_n = h_{n+q}\Phi_{n+q} + h_{n+q-2}\Phi_{n+q-2} + h_{n+q-4}\Phi_{n+q-4} + \cdots$.
2. Write the orthogonality property of $\Phi_n$ as

(12)  $\int_a^b p\Phi_nG_{n-1}\,dx = 0$.

(6) and (7) yield at once

(13)  $Ap\Phi_nG_\nu \Big|_a^b = 0 = \int_a^b (Ap\Phi_nG_\nu)'\,dx = \int_a^b p[B\Phi_n + A\Phi_n']G_\nu\,dx + \int_a^b pA\Phi_nG_\nu'\,dx$.

The relation (13) is the basis of the subsequent discussion.
Let

(14)  $\delta A = r$, $\delta B = s$.

(13), through (12), leads to

(15)  $\int_a^b p[B\Phi_n + A\Phi_n']G_{n-r}\,dx = 0$,

whence,

(16)  $B\Phi_n + A\Phi_n' = \sum_{i=n-r+1}^{n+\nu} h_i\Phi_i$, $\nu = \max(s, r - 1)$.













It is important to note that the summation on the right (and similar summations
in the subsequent discussion) contains a fixed number of terms, independent of $n$.
By virtue of the lemma, we write

(16.1)  $B\Phi_n + A\Phi_n' = Q_{1,n}\Phi_n + Q_{2,n}\Phi_{n-1}$,

(16.2)  $B\Phi_n + A\Phi_n' = Q_{3,n}\Phi_{n-1} + Q_{4,n}\Phi_{n-2}$.

We get further from (16.1) by differentiation and combining with (16.2)

(17)  $A^2\Phi_n'' + AQ_{5,n}\Phi_n' + AQ_{6,n}\Phi_n = Q_{7,n}\Phi_{n-1} + Q_{8,n}\Phi_{n-2}$.

The desired differential equation for $\Phi_n$ is now obtained by eliminating $\Phi_{n-1}$,
$\Phi_{n-2}$ from (16.2), (17) and (1):

(18)  $\begin{vmatrix} x - c_n & -\lambda_n & \Phi_n \\ Q_{3,n} & Q_{4,n} & A\Phi_n' + B\Phi_n \\ Q_{7,n} & Q_{8,n} & A^2\Phi_n'' + AQ_{5,n}\Phi_n' + AQ_{6,n}\Phi_n \end{vmatrix} = 0$,

(19)  $A^2Q_{9,n}\Phi_n'' + AQ_{10,n}\Phi_n' + Q_{11,n}\Phi_n = 0$.

It is seen that $Q_{11,n}$ is divisible by $A$, since no $\Phi_n$ may have a common zero with
$A$ (the zeros of $\Phi_n$ lying between $a$ and $b$), and the differential equation takes the
desired form

(20)  $AQ_n\Phi_n'' + B_n\Phi_n' + C_n\Phi_n = 0$.

If $B$ and $A$ have a common factor $C$, write, in place of (16.1) and (16.2),

(21.1)  $B_1\Phi_n + A_1\Phi_n' = Q_{1,n}\Phi_n + Q_{2,n}\Phi_{n-1}$,
(21.2)  $B_1\Phi_n + A_1\Phi_n' = Q_{3,n}\Phi_{n-1} + Q_{4,n}\Phi_{n-2}$  $\left(B_1 \equiv \dfrac{B}{C},\ A_1 \equiv \dfrac{A}{C}\right)$,

and proceed as above, with less computation; for the degrees of the polynomials
involved will be reduced by $\delta C = r_1$. By hypothesis,

(22)  $C > 0$ inside $(a, b)$.

(15) now becomes

(23)  $\int_a^b Cp[B_1\Phi_n + A_1\Phi_n']G_{n-r}\,dx = 0$,

so that here

(24)  $B_1\Phi_n + A_1\Phi_n' = \sum_{i=n-r+1}^{n+\nu_1} h_iv_i$, $\nu_1 = \max(r - r_1 - 1,\ s - r_1)$  $(v_i \equiv \Phi_i(x; a, b; pC))$;
(25)  $B_1\Phi_n + A_1\Phi_n' = Q_{5,n}v_n + Q_{6,n}v_{n-1}$,
(26)  $A_1\Phi_n'' + (B_1 + A_1')\Phi_n' + B_1'\Phi_n = Q_{5,n}'v_n + Q_{6,n}'v_{n-1} + Q_{5,n}v_n' + Q_{6,n}v_{n-1}'$,

and we may use the known properties of $v_n$, $v_n'$ (see below).







3. The method just developed may be modified (not in principle, but in
details) as follows. Take in (13) $\nu = n - \max(r, s + 1) = n - \sigma - 1$, and
we get

(27)  $\int_a^b Ap\Phi_n'G_{n-\sigma-1}\,dx = 0$,

(28)  $A\Phi_n' = \sum_{i=n-\sigma}^{n+r-1} h_{i,n}\Phi_i$,
      $A\Phi_n' = Q_{1,n}\Phi_n + Q_{2,n}\Phi_{n-1} = Q_{3,n}\Phi_{n-1} + Q_{4,n}\Phi_{n-2}$.

We now proceed precisely as above (see (17), (18)), and we again obtain a
differential equation for $\Phi_n$ of the type (20):

(29)  $A\bar{Q}_n\Phi_n'' + \bar{B}_n\Phi_n' + \bar{C}_n\Phi_n = 0$.

Comparing it with (20), we conclude that

(30)  $\dfrac{Q_n}{\bar{Q}_n} = \dfrac{B_n}{\bar{B}_n} = \dfrac{C_n}{\bar{C}_n}$;

for otherwise

$\dfrac{\Phi_n'}{\Phi_n} = -\dfrac{C_n\bar{Q}_n - \bar{C}_nQ_n}{B_n\bar{Q}_n - \bar{B}_nQ_n}$.

This is impossible since the degree of $\Phi_n$ varies with $n$ and $\Phi_n'$ has no factor in
common with $\Phi_n$, while the degrees of the polynomials on the right are fixed
(by (28)). The relations (30) may lead to the explicit determination of the
constants $\lambda_n$, $c_n$, $\cdots$ entering into (20) and (29).$^3$

The case when $A = A_1C$ can be treated in the same manner as the above case:

$A = A_1C$, $B = B_1C$.


4. The same method yields the solution of the following

Problem. Given the differential equation (3) for $\Phi_n(x; a, b; p)$. Find a
similar equation for $u_n \equiv \Phi_n(x; a, b; p\Pi)$, as given in (8).

Solution. First, rewrite (10) as

(31)  $\Pi u_n = Q_{1,n}\Phi_{n+1} + Q_{2,n}\Phi_n$.


Secondly, reasoning as above and making use of (9), we get

$A_1p_1u_nG_\nu \Big|_a^b = 0 = \int_a^b (A_1p_1u_nG_\nu)'\,dx$  $(p_1(x) = \Pi(x)p(x))$,


3 The following is a still simpler variation of the same method. We get (integrate by
parts): $\int_a^b (Ap\Phi_n')'G_{n-\nu}\,dx - \int_a^b (ApG_{n-\nu}')'\Phi_n\,dx = 0$, whence, with $\nu = \max(r - 1, s)$:

(*)  $A\Phi_n'' + B\Phi_n' = \sum_{i=n-\nu+1}^{n+\sigma} h_{i,n}\Phi_i = Q_{n-1}\Phi_{n-1} + Q_{n-2}\Phi_{n-2}$  $(\sigma = \max(r - 2, s - 1))$.

The desired differential equation for $\Phi_n$ is obtained by eliminating $\Phi_{n-1}$, $\Phi_{n-2}$ from (1),
(28), and (*).








$\int_a^b A_1p_1u_n'G_{n-\sigma_1-1}\,dx = \int_a^b A\Pi^2pu_n'G_{n-\sigma_1-1}\,dx = 0$
$(\delta A_1 = r_1,\ \delta B_1 = s_1,\ \sigma_1 = \max(r_1 - 1, s_1))$,

(32)  $A\Pi^2u_n' = \sum_{i=n-\sigma_1}^{n-1+r+2q} h_{i,n}\Phi_i = Q_{3,n}\Phi_{n+1} + Q_{4,n}\Phi_n$.










Combining with (31), we get

(33)  $\Phi_{n+1} = \Pi(R_{1,n}u_n' + R_{2,n}u_n)$,
      $\Phi_n = \Pi(R_{3,n}u_n' + R_{4,n}u_n)$.

Differentiating twice, substituting into (3) and clearing fractions, we obtain

(34)  $Q_{5,n}u_n''' + Q_{6,n}u_n'' + Q_{7,n}u_n' + Q_{8,n}u_n = 0$,
(35)  $Q_{9,n}u_n''' + Q_{10,n}u_n'' + Q_{11,n}u_n' + Q_{12,n}u_n = 0$.

Finally, combine (34) and (35), so as to eliminate $u_n'''$, and the desired differential
equation follows:

(36)  $D_nu_n'' + E_nu_n' + F_nu_n = 0$,

where $D_n$, $E_n$, $F_n$ are polynomials in $x$ of certain fixed degrees.
We shall not dwell here upon possible modifications and simplifications of
the above procedure.







5. The general method, as exhibited in the foregoing, may be applied to the classical
orthogonal polynomials in which case it yields very readily the classical differential
equations, also the explicit expressions for the quantities $\lambda_n$, $c_n$, $S_n$, $\cdots$.
We now turn to two new cases.
(i) $\Phi_n(x; -\infty, \infty; e^{-x^4/4})$, symmetric case. We take here $A = 1$, so that
$B = -x^3$. The above considerations yield the following results.

($\alpha$)  $\int_{-\infty}^{\infty} p\Phi_n'G_{n-4}\,dx = 0$.

Proceeding as above, we obtain the desired differential equation in the determinantal
form (18), which we shall not write down.
Here we make use of the following formula,$^4$ valid in the symmetric case:

(37)  $d_{n,n-2} = -(\lambda_2 + \lambda_3 + \cdots + \lambda_n)$, $\lambda_n = d_{n-1,n-3} - d_{n,n-2}$

(compare coefficients in the recurrence relation (1)).


4 The interest of this formula lies in the fact that if we know but the second highest
coefficient of $\Phi_n$, we can find $\lambda_n$, then the "normalizing coefficient" $a_n = (\lambda_1\lambda_2 \cdots \lambda_{n+1})^{-1/2}$,
then Hankel's determinant $\Delta_n = (a_0a_1 \cdots a_{n-1})^{-2}$ of order $n$ formed by the moments $\alpha_i =
\int_a^b p(x)x^i\,dx$ (M, p. 13). Illustration: the polynomials of Legendre and Hermite.











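($\beta$)  $\int_{-\infty}^{\infty} p(x)(-x^3\Phi_n + \Phi_n')G_n\,dx = 0$; $-x^3\Phi_n + \Phi_n' = -\Phi_{n+3} - h_n\Phi_{n+1}$.

Comparing coefficients and making use of (37), we see that

$h_n = \lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3}$.

Proceeding as above, we get a second differential equation in determinantal
form which, if we expand and compare coefficients of $x^{n+4}$, $x^{n+2}$, yields

(38)  $2(\lambda_2 + \lambda_3 + \cdots + \lambda_{n+1}) = -\lambda_{n+2}[1 - (\lambda_{n+1} + \lambda_{n+2})(\lambda_{n+2} + \lambda_{n+3})]$,
(39)  $\lambda_{n+2}(\lambda_{n+1} + \lambda_{n+2} + \lambda_{n+3}) = n + 1$,

and the said equation takes the form

(40)  $M_n\Phi_n'' - x(x^2M_n + 2)\Phi_n' + \{x^2(nM_n - 2\lambda_{n+1}) - \lambda_{n+2}M_n[1 - (\lambda_{n+1} + \lambda_{n+2})(\lambda_{n+2} + \lambda_{n+3})]\}\Phi_n = 0$;
      $M_n \equiv x^2 + \lambda_{n+1} + \lambda_{n+2}$.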


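Comparing this with the differential equation obtained in ($\alpha$), we get the following
relations for the $\lambda$'s:

(41)  $2(\lambda_2 + \lambda_3 + \cdots + \lambda_n) = n\lambda_n + \lambda_{n-1}\lambda_n\lambda_{n+1} = \lambda_{n+1}(n - 1 + \lambda_n\lambda_{n+2})$,
(42)  $\lambda_{n+1}(\lambda_n + \lambda_{n+1} + \lambda_{n+2}) = n$,

whence, $\lambda_{n+1} < n^{1/2}$. We have further

$\alpha_{2n+1} = 0$; $\alpha_{2n} = \int_{-\infty}^{\infty} e^{-x^4/4}x^{2n}\,dx = 2^{n-\frac12}\,\Gamma[\tfrac14(2n + 1)]$  $(n \ge 0)$;

$\alpha_{2n} = (2n - 3)\alpha_{2n-4}$  $(n \ge 2)$; $\alpha_0\alpha_2 = \pi\sqrt{2}$;

$\lambda_1 = \Delta_1$, $\lambda_2 = \dfrac{\Delta_2}{\Delta_1^2}$, $\lambda_3 = \dfrac{\Delta_1\Delta_3}{\Delta_2^2}$, $\cdots$ [M, pp. 9, 13];

$\Delta_1 = \alpha_0 = 2^{-\frac12}\,\Gamma(\tfrac14)$, $\Delta_2 = \alpha_0\alpha_2$, $\Delta_3 = \alpha_2(\alpha_0\alpha_4 - \alpha_2^2)$, $\cdots$.

The above values for $\Delta_1$, $\Delta_2$, $\Delta_3$, combined with (42), enable us to compute $\lambda_n$
(the only parameter entering into the differential equation (40)) for any given $n$.$^5$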
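
Relation (42) lends itself to a numerical check. The Python sketch below computes the $\lambda_n$ for the weight $e^{-x^4/4}$ independently of (42), by generating the $\Phi_k$ from the recurrence (1) on a grid and using $\lambda_{k+1} = N_k/N_{k-1}$ with $N_k = \int p\,\Phi_k^2\,dx$ (which follows from footnote 4). The trapezoid rule, the truncation of the line to $[-8, 8]$ and the function name are assumptions of this sketch, not part of the paper.

```python
import math


def quartic_lambdas(n_max, half_width=8.0, steps=4000):
    """Return {n: lambda_n} for 2 <= n <= n_max, weight exp(-x**4/4) on the real line."""
    h = 2.0 * half_width / steps
    xs = [-half_width + i * h for i in range(steps + 1)]
    # Trapezoid weights; the integrand is negligible at the end points.
    ws = [h * math.exp(-x ** 4 / 4.0) for x in xs]
    prev = [0.0] * len(xs)      # grid values of Phi_{k-1}
    cur = [1.0] * len(xs)       # grid values of Phi_k, starting from Phi_0 = 1
    lambdas = {}
    prev_norm = None
    for k in range(n_max):
        norm = sum(w * c * c for w, c in zip(ws, cur))    # N_k = integral of p * Phi_k^2
        lam = norm / prev_norm if k >= 1 else 0.0
        if k >= 1:
            lambdas[k + 1] = lam                          # lambda_{k+1} = N_k / N_{k-1}
        # Recurrence (1) with all c_n = 0: Phi_{k+1} = x*Phi_k - lambda_{k+1}*Phi_{k-1}
        nxt = [x * c - lam * p for x, c, p in zip(xs, cur, prev)]
        prev, cur, prev_norm = cur, nxt, norm
    return lambdas


if __name__ == "__main__":
    lam = quartic_lambdas(7)
    for n in range(2, 6):
        # Left side of (42); it should be close to n.
        print(n, lam[n + 1] * (lam[n] + lam[n + 1] + lam[n + 2]))
```

The check is run for $n \ge 2$, where all three $\lambda$'s in (42) are ratios of consecutive norms; the same output can be compared with $\lambda_2 = \alpha_2/\alpha_0 = 2\Gamma(\tfrac34)/\Gamma(\tfrac14)$.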
(ii) $\Phi_n(x; -1, 1; (1 - x^2)^{\frac12}(1 - \mu x^2)^{-1})$, $\mu \le 1$, again a symmetric case.
We take here $A = (1 - x^2)(1 - \mu x^2)$, so that $B = -3x(1 - \mu x^2)$. Thus, $A$ and
$B$ have a common factor whose presence introduces marked improvements in
our general considerations, as we proceed to show.
We have here

(43)  $\int_{-1}^{1} Ap\Phi_n'G_{n-4}\,dx = \int_{-1}^{1} (1 - x^2)^{\frac32}\Phi_n'G_{n-4}\,dx = 0$.


5 It would be of interest to devise an explicit expression for $\Delta_n$ as a function of $n$. This
would yield an expression for $\lambda_n$. The positiveness of all $\Delta_n$ yields inequalities for $\Gamma(\tfrac14)$.
Thus, from A, > 0, A; > 0, 


3? > r(d)r te? > 2!. 










Introduce the following sequences of Jacobi polynomials

(44)  $u_n \equiv J_n(x; \tfrac52, \tfrac52)$, $v_n \equiv J_n(x; \tfrac32, \tfrac32)$

for which, as is known (M, pp. 17, 33),

(45)  $(1 - x^2)v_n'' - 3xv_n' + n(n + 2)v_n = 0$,
      $v_n(\cos\varphi) = \dfrac{\sin(n + 1)\varphi}{2^n\sin\varphi}$,
      $v_n' = nu_{n-1}$.

(43) yields at once

(46)  $\Phi_n' = nu_{n-1} + (n - 2)h_nu_{n-3} = v_n' + h_nv_{n-2}'$,
(47)  $\Phi_n = x^n + d_{n,n-2}x^{n-2} + \cdots = v_n + h_nv_{n-2} + h_n'$.

Here $h_n' = 0$. For $n = 2m + 1$, this follows immediately from $\Phi_{2m+1}(0) =
v_{2m+1}(0) = 0$ $(m = 0, 1, 2, \cdots)$. If $n$ is even, we make use of the recurrence
relation (1) for $\Phi_{2m}$, $v_{2m}$.

We get further, by (46), (45), (23),

(48)  $(1 - x^2)\Phi_n'' - 3x\Phi_n' = -n(n + 2)v_n - h_n(n - 2)nv_{n-2}$.
We have also (by (23)) 
(1 — 2°)@), — 3rd, = —(n + 3)0ner + hen + AswNn-n-s 
(ni, corresponds to ,), 
and this relation, through (1), leads to 
2(1 — 2°)@), — 32°, = [—(n + 3)a° — hsn — (n+ 3)An + hayalon 
es + [Ai(hsnat + (m + B)Nog3) + haya] On-s 


The desired differential equation is now obtained by eliminating $v_n$, $v_{n-2}$ from
(47), (48), (49).
It remains to determine the various constants in the above relations. By
comparing coefficients in (47), (49) and making use of

$v_n = x^n - \dfrac{n - 1}{2^2}x^{n-2} + \dfrac{(n - 2)(n - 3)}{32}x^{n-4} - \cdots$,

we get

(50)  $h_n = d_{n,n-2} + \dfrac{n - 1}{4} = \dfrac{n - 1}{4} - (\lambda_2 + \cdots + \lambda_n)$,

Gaunt = 
















All reduces to finding $h_n$. For this purpose we make use of

$\int_{-1}^{1} \dfrac{(1 - x^2)^{\frac12}}{1 - \mu x^2}\,x^\epsilon\Phi_n(x)\,dx = \int_{-1}^{1} \dfrac{(1 - x^2)^{\frac12}}{1 - \mu x^2}\,x^\epsilon(v_n + h_nv_{n-2})\,dx = 0$

($\epsilon = 0, 1$ for $n$ even, odd respectively),


“cosaxdr _ =x j(l— p)' — \ 
| | p (ipl <D, 


where a is a positive integer or 0. We get*® 


2 2—- 
hn a Pd * os (v ons (1 = u)), , = eee, 
m 


(1 — ale + = - uz" | + {uz - | 20 —u) + = + 1 | aw. 


+ {(n + v)(n — 2v) — n° ya*} , = 0. 


(51) 


Note that our present method gives not only the differential equation under
discussion, but also (and without any further considerations) the explicit expressions
for $\Phi_n(x)$, also for $\lambda_n$, $d_{n,n-2}$, $\cdots$.

The corresponding formulas for the limiting cases $p(x) = (1 - x^2)^{-\frac12}$, $(1 - x^2)^{\frac12}$
are furnished by the same formulas, if we let $\mu \to 1, 0$ respectively; in the latter
case we take $\nu = -(1 - \mu)^{\frac12}$, since here $h_n = 0$ for all $n$.

Remark. Using the known transformation from the symmetric to the cor- 
responding non-symmetric case (M, p. 19), we readily derive from the foregoing 
results differential equations of the desired type for 


ee 
(62) &,(2;0, o;e 2"), (:0, 1; fae “), a 


6. We now return to the general discussion. Take

(53)  $A = (x - a)(b - x)$,

where $(a, b)$ is the interval of orthogonality (with the customary agreement to
replace $x - a$ by unity, if $a = -\infty$). (Similarly for $b - x$.) Make use of the
following simple identity (integrate by parts):

(54)  $\int_a^b f(\gamma\psi')'\,dx = \int_a^b \psi(\gamma f')'\,dx$, $\gamma(a) = \gamma(b) = 0$,


6 The explicit expression for $\Phi_n(x)$ may be obtained by the method of G. Szegö: Ueber
die Entwickelung einer willkürlichen Funktion nach den Polynomen eines Orthogonalsystems,
Math. Zeitschrift, vol. 12(1922), pp. 61-94. Cf. also S. Bernstein, Jour. des Math., (9),
vol. 9(1930), p. 175 and Comm. Soc. Math. Kharkoff, 1930. J. Geronimus (Kharkoff,
Russia) obtained and communicated to the author this differential equation without,
however, revealing the method used.







which is valid for any two functions $f$, $\psi$, each possessing in $(a, b)$ first and
second derivatives. We get, by virtue of (6), (7) and the orthogonality property
of $\varphi_n$,

(55)  $\int_a^b (Ap\varphi_n')'G_{n-1}\,dx = \int_a^b \varphi_n(ApG_{n-1}')'\,dx = \int_a^b p\varphi_nBG_{n-1}'\,dx$.

Consider, first, the classical OP of Jacobi (J) in $(-1, 1)$, Laguerre (L), Hermite
(H). Here $\delta B = 1$; hence,

$\int_a^b (Ap\varphi_n')'G_{n-1}\,dx = 0$,

(56)  $(Ap\varphi_n')' = -C_np\varphi_n$  ($C_n$ = const.).

This is the classical differential equation (M, p. 33), with

(57)  $C_n = n(n + \alpha + \beta - 1)$ (J), $n$ (L), $2n$ (H).
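
The Laguerre entry of (57) can be confirmed with exact arithmetic. The sketch below assumes the weight $p = e^{-x}$ on $(0, \infty)$, so that $A = x$, $B = 1 - x$, and (56) reads $x\varphi'' + (1 - x)\varphi' + n\varphi = 0$ with $C_n = n$. It uses the standard explicit Laguerre polynomials (normalized by $L_n(0) = 1$ rather than orthonormally, which does not matter for a homogeneous linear equation); the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre(n):
    """Coefficients (lowest power first) of L_n(x) = sum_k (-1)^k C(n, k) x^k / k!."""
    return [Fraction((-1) ** k * comb(n, k), factorial(k)) for k in range(n + 1)]


def derivative(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def laguerre_residual(p, n):
    """Coefficients of A*p'' + B*p' + C_n*p with A = x, B = 1 - x, C_n = n."""
    d1 = derivative(p)
    d2 = derivative(d1)
    res = [Fraction(0)] * (len(p) + 1)
    for k, c in enumerate(d2):
        res[k + 1] += c          # x * p''
    for k, c in enumerate(d1):
        res[k] += c              # 1 * p'
        res[k + 1] -= c          # -x * p'
    for k, c in enumerate(p):
        res[k] += n * c          # C_n * p
    return res
```

For every $n$ tried, `laguerre_residual(laguerre(n), n)` is identically zero, in agreement with $C_n = n$ for (L).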

The consequences which follow directly from (56) have not been exhausted
yet, as we proceed to show. In the first place, we write

(58)  $Ap\varphi_n' \equiv A(x)p(x)\varphi_n'(x) = -C_n\int_a^x p(t)\varphi_n(t)\,dt$,
      $A(x)p(x)\varphi_n'(x) = O(C_n)$, $= O(n^2)$ (J), $= O(n)$ (L, H)  $(a \le x \le b)$,

since, by the Schwarz inequality, each of

$\int_a^x p(t)\varphi_n(t)\,dt$, $-\int_x^b p(t)\varphi_n(t)\,dt = O(1)$  $(a \le x \le b)$.

Making use of the relation$^7$

(59)  $\varphi_n'(x; p) = C_n^{\frac12}\varphi_{n-1}(x; Ap)$,

we get from (58)

(60)  $A(x)p(x)\varphi_{n-1}(x; Ap) = -C_n^{\frac12}\int_a^x p(t)\varphi_n(t)\,dt = O(C_n^{\frac12})$  $(a \le x \le b)$,

(61)  $A(x)p(x)\varphi_{n-1}(x; Ap) = O(n)$ (J), $O(n^{\frac12})$ (H, L).


On the other hand, if we make use of the known asymptotic expression for
$\varphi_n(x)$ (M, pp. 62-64), we get from (60)


[ vente) at = O(n) 


(62) 


z 


i p(t)ea(t) dt = O(n) 


7 J. Shohat, On the development of functions in series of orthogonal polynomials, Bull 
Amer. Math. Soc., vol. 41(1935), pp. 49-82; p. 75. 







(Here and hereafter $\epsilon$ and $C$ stand respectively for an arbitrarily small and an
arbitrarily large, but fixed, positive quantity.) Another result, derived very
simply from (56), through (60), concerns the expansion of functions in series
of classical OP.

It is known that for these OP Parseval’s Formula holds, so that 


z x z b 
@) f rosoa= Ls f rOewa, f= f popentdat 


b 
it | p(t)f (t) dt exists; convergence uniform, a S$ z S 6). Hence, by (60), 


=1 


(4) = F@) =f posWat = LY — fC A@ple)ena(z; Ap) 
(convergence uniform, a S x S b), whence we get 


z b 
THEOREM I. If $F(x)$ may be represented as $\int_a^x p(t)f(t)\,dt$ and $\int_a^b p(t)f^2(t)\,dt$
exists, then $F(x)/A(x)p(x)$ can be expanded in a series of the polynomials
$\{\varphi_n(x; Ap)\}$, namely,

(65)  $\dfrac{F(x)}{A(x)p(x)} = -\sum_{n=1}^{\infty} C_n^{-\frac12}\left(\int_a^b p(t)f(t)\varphi_n(t)\,dt\right)\varphi_{n-1}(x; Ap)$,

which converges uniformly in any interval wholly inside $(a, b)$.$^8$


Note that no use was made here of the asymptotic properties of $\varphi_n(x)$ and
that we obtained explicit formulas for the coefficients in the expansion (65) in
terms of those for $f(x)$. Note also that, without any further assumption concerning
$f(x)$, we get for the remainder in the expansion (63), by (62),


[ roo a - DK [ r@ede ae 


Ey 2 z 273 
s[ 54D ([ r@ema) | =o W), om) 4,1, 


t=n+ 

where x is as in (62). With p(x) = 1 and (a, b) finite, this leads to the approx- 
imation of a certain class of absolutely continuous functions, by means of poly- 
nomials—certain indefinite integrals of Legendre polynomials. 

Consider, next, the case where $\delta B \le 2$:

(67)  $B = hx^2 + kx + l$.

While the differential equation for $\varphi_n(x)$ may be derived by the method
developed above, here we center our attention on $\varphi_n'(x)$ and on

(68)  $K_n(x) \equiv \sum_{i=0}^{n} \varphi_i^2(x) = \lambda_{n+2}^{\frac12}[\varphi_{n+1}'(x)\varphi_n(x) - \varphi_n'(x)\varphi_{n+1}(x)]$ (M, p. 25).


8 Cf. loc. cit. in footnote 7, pp. 71-76; see also (for Laguerre and Hermite polynomials)
H. Weyl, Singuläre Integralgleichungen $\cdots$, Thesis, Göttingen (1908), pp. 63, 74, where
$F(x)$ is subject to more restrictive conditions.







By (5),
(69)  $p(x) = e^{-hx}(x - a)^{\alpha-1}(b - x)^{\beta-1}$  ($\alpha, \beta > 0$; $a$, $b$ finite),
(70) p(x) = ehte(e — a)" (a finite,b = «;r <0, ifh = 0), 
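[In case $(a, b) = (-\infty, \infty)$, we get again Hermite polynomials.] Apply (55) to
$G_{n-2}(x)$, and we get

$\int_a^b (Ap\varphi_n')'G_{n-2}\,dx = \int_a^b p\varphi_n(BG_{n-2}')\,dx = 0$,

(71)  $(Ap\varphi_n')' + C_np\varphi_n = p\,[h_{n,n+1}\varphi_{n+1} + h_{n,n-1}\varphi_{n-1}]$,

where $C_n$, $h_{n,n+1}$, $h_{n,n-1}$ are constants. Comparing the coefficients of $x^{n+1}$ gives
at once

$h_{n,n+1} = nh\,\dfrac{a_n}{a_{n+1}} = nh\lambda_{n+2}^{\frac12}$.

Moreover, by the orthogonality property and making use of (55), we have

$h_{n,n-1} = \int_a^b (Ap\varphi_n')'\varphi_{n-1}\,dx = \int_a^b pB\varphi_n\varphi_{n-1}'\,dx = (n - 1)h\lambda_{n+1}^{\frac12}$,

whence,

(72)  $A\varphi_n'' + B\varphi_n' + C_n\varphi_n = nh\lambda_{n+2}^{\frac12}\varphi_{n+1} + (n - 1)h\lambda_{n+1}^{\frac12}\varphi_{n-1}$,

(73)  $C_n = n(n - 1) - nk - hS_n - nhc_{n+1}$, in case (69),
      $C_n = -nk - hS_n - nhc_{n+1}$, in case (70).

Making use of the recurrence relation (M, p. 24), we obtain

$\lambda_{n+2}^{\frac12}\varphi_{n+1}(x) = (x - c_{n+1})\varphi_n(x) - \lambda_{n+1}^{\frac12}\varphi_{n-1}(x)$.

We get further

(74)  $A\varphi_n'' + B\varphi_n' + D_n\varphi_n = -h\lambda_{n+1}^{\frac12}\varphi_{n-1}$,

(75)  $D_n \equiv C_n - nh(x - c_{n+1}) = n(n - 1) - hS_n - n(hx + k)$, in case (69),
      $D_n \equiv C_n - nh(x - c_{n+1}) = -hS_n - n(hx + k)$, in case (70).

Rewrite (74), by virtue of (5), (7), as

$(Ap\varphi_n')' = -h\lambda_{n+1}^{\frac12}p\varphi_{n-1} - D_np\varphi_n$,

whence,

(76)  $A(x)p(x)\varphi_n'(x) = -h\lambda_{n+1}^{\frac12}\int_a^x p(t)\varphi_{n-1}(t)\,dt - \int_a^x p(t)D_n(t)\varphi_n(t)\,dt$.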







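We now turn to $K_n(x)$. By virtue of (68), we readily get from (74) the following
differential equation for $K_n(x)$:

(77)  $AK_n'(x) + BK_n(x) = \lambda_{n+2}^{\frac12}(D_n - D_{n+1})\varphi_n\varphi_{n+1} - h\lambda_{n+2}^{\frac12}[\lambda_{n+2}^{\frac12}\varphi_n^2 - \lambda_{n+1}^{\frac12}\varphi_{n+1}\varphi_{n-1}]$,

whence, by (75),$^9$

(78)  $A(x)p(x)K_n(x) = \lambda_{n+2}^{\frac12}\int_a^x [D_n(t) - D_{n+1}(t)]\,p(t)\varphi_n(t)\varphi_{n+1}(t)\,dt - h\lambda_{n+2}^{\frac12}\int_a^x p(t)[\lambda_{n+2}^{\frac12}\varphi_n^2(t) - \lambda_{n+1}^{\frac12}\varphi_{n+1}(t)\varphi_{n-1}(t)]\,dt$,

$D_n(x) - D_{n+1}(x) = -2n + hc_{n+1} + hx + k$, $hc_{n+1} + hx + k$
(in cases (69), (70) respectively).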


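In case $(a, b)$ is finite, reduced, without loss of generality, to $(-1, 1)$, we
know that

$c_n = O(1)$, $S_n = O(n)$.

Hence, here

(80)  $C_n = n(n - 1)[1 + O(n^{-1})]$;
(81)  $A(x)p(x)K_n(x) = O(n)$  $(-1 \le x \le 1)$,
(82)  $[A(x)p(x)]^{\frac12}\,|\varphi_n(x)| = O(n^{\frac12})$  $(-1 \le x \le 1)$,
      $|\varphi_n(x)| = O(n^{\frac12})$  $(-1 + \epsilon \le x \le 1 - \epsilon)$,

and, by the Markoff-Bernstein Theorem,

(83)  $|\varphi_n'(x)| = O(n^{\frac32})$
      $(-1 + \epsilon + \epsilon_1 \le x \le 1 - \epsilon - \epsilon_1$, i.e., $x$ inside $(-1, 1))$.

The estimate (81) may be applied to the study of the expansion

$f(x) \sim \sum_{i=0}^{\infty} \left[\int_{-1}^{1} pf\varphi_i\,dx\right]\varphi_i(x) \equiv \sum_{i=0}^{\infty} f_i\varphi_i(x) \equiv \sum_{i=0}^{n} f_i\varphi_i(x) + R_n(x) \equiv S_n + R_n$,

where $f(x)$ is assumed continuous in $(-1, 1)$. We have, denoting by $E_n(f)$
the "best approximation" (in the Tchebycheff sense) of $f(x)$ on $(-1, 1)$ by
polynomials of degree $\le n$,$^{10}$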

9 Cf., for the classical orthogonal polynomials, J. Shohat and C. Winston, On mechanical
quadratures, Rendic. Circ. Mat. Palermo, vol. 58(1934), pp. 1-13; pp. 4-5.

10 Loc. cit. (footnote 7), p. 59.







(84)  $|R_n(x)| \le E_n(f)\{1 + [K_n(x)]^{\frac12}\} = O(E_n(f)\,n^{\frac12})$  $(-1 + \epsilon \le x \le 1 - \epsilon)$,

by (81).

This leads to

THEOREM II. The expansion of $f(x)$ in a series of OP, where $p(x) = e^{-hx}
(1 + x)^{\alpha-1}(1 - x)^{\beta-1}$ $(\alpha, \beta > 0)$, converges uniformly in any interval wholly inside
$(-1, 1)$, provided $f(x)$ satisfies therein a Lipschitz condition of order $> \tfrac12$.

Some of the foregoing results can be, and have been, improved, by use of more 
refined methods. Here they have been derived by means of very simple con- 
siderations as a direct sequel to the differential equation for the OP under 
discussion. 

The same elementary considerations lead to new and important results in 
the theory of mechanical quadratures, as we proceed to show in the closing 


section. 


7. We turn once more to the classical OP, i.e., we let h = 0 in (67). Formula 
(78) now becomes 


(8 A(a)p(2)Kq(2) = Nhia(k — 2n) f " p(Dealtens(t) dt 


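Denoting the zeros of $\varphi_n(x)$ by $x_{i,n}$ $(1 \le i \le n)$, with 

(86) $a < x_{1,n+1} < x_{1,n} < x_{2,n+1} < \cdots < x_{n,n} < x_{n+1,n+1} < b$, 

we learn from (85) that 

(87) $A(x)p(x)K_n(x)$ is maximum at $x = x_{i,n+1}$ $(1 \le i \le n+1)$, is minimum at $x = x_{i,n}$ $(1 \le i \le n)$; 

(88) $p_1(x_{1,n+1})K_n(x_{1,n+1}) > p_1(x_{1,n})K_n(x_{1,n}) < p_1(x_{2,n+1})K_n(x_{2,n+1}) > \cdots > p_1(x_{n,n})K_n(x_{n,n}) < p_1(x_{n+1,n+1})K_n(x_{n+1,n+1})$. 

Hence, 

(89) $p_1(x_{i,n})K_n(x_{i,n}) < p_1(x_{i+\epsilon,n+1})K_n(x_{i+\epsilon,n+1})$. 

Introduce the Gaussian Mechanical Quadratures Formula 

(90) $\int_a^b p(x)f(x)\,dx = \sum_{i=1}^{n} H_{i,n}f(x_{i,n})$, $\quad H_{i,n} = \dfrac{1}{K_n(x_{i,n})} = \dfrac{1}{K_{n-1}(x_{i,n})}$. 

Then (88), (89) become 

(91) $\dfrac{H_{1,n+1}}{p_1(x_{1,n+1})} < \dfrac{H_{1,n}}{p_1(x_{1,n})} > \dfrac{H_{2,n+1}}{p_1(x_{2,n+1})} < \cdots < \dfrac{H_{n,n}}{p_1(x_{n,n})} > \dfrac{H_{n+1,n+1}}{p_1(x_{n+1,n+1})}$, 

$\dfrac{H_{i,n}}{p_1(x_{i,n})} > \dfrac{H_{i+\epsilon,n+1}}{p_1(x_{i+\epsilon,n+1})}$ $\quad (1 \le i \le n;\ \epsilon = 0, 1)$. 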
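
The inequalities (91) can be checked numerically in the simplest Jacobi case. The sketch below is an illustration only, not part of the original argument. It assumes the Legendre weight p(x) = 1 on (-1, 1), that is $\alpha = \beta = 1$ and $p_1(x) = 1 - x^2$, and it computes the Gaussian nodes and coefficients by Newton's method on the Legendre recurrence. The function names are hypothetical.

```python
import math

def gauss_legendre(n):
    """Nodes (ascending) and coefficients H_{i,n} of the n-point Gauss-Legendre rule."""
    nodes, coeffs = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard starting guess
        for _ in range(100):
            p_prev, p_cur = 1.0, x                      # P_0(x), P_1(x)
            for k in range(2, n + 1):                   # three-term recurrence up to P_n
                p_prev, p_cur = p_cur, ((2 * k - 1) * x * p_cur - (k - 1) * p_prev) / k
            dp = n * (x * p_cur - p_prev) / (x * x - 1.0)  # P_n'(x)
            dx = p_cur / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        coeffs.append(2.0 / ((1.0 - x * x) * dp * dp))
    order = sorted(range(n), key=lambda j: nodes[j])
    return [nodes[j] for j in order], [coeffs[j] for j in order]

def ratios(n):
    """H_{i,n} / p_1(x_{i,n}), assuming p_1(x) = 1 - x^2 (Legendre case)."""
    x, h = gauss_legendre(n)
    return [hi / (1.0 - xi * xi) for xi, hi in zip(x, h)]

def chain_91_holds(n):
    """Check H_{i,n}/p_1(x_{i,n}) > H_{i+e,n+1}/p_1(x_{i+e,n+1}) for e = 0 and e = 1."""
    r_n, r_next = ratios(n), ratios(n + 1)
    return all(r_n[i] > r_next[i] and r_n[i] > r_next[i + 1] for i in range(n))

if __name__ == "__main__":
    for n in range(1, 11):
        print(n, chain_91_holds(n))
```

For n = 1 the single ratio is $H_{1,1}/p_1(x_{1,1}) = 2$, which is the starting value of both decreasing chains in this case.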





DIFFERENTIAL EQUATION FOR ORTHOGONAL POLYNOMIALS 


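We have further 

(92) $p_1'(x) = Bp$; $\quad B = \alpha - \beta - (\alpha + \beta)x$ (J), $\ \alpha - x$ (L), $\ -2x$ (H). 

Hence, $p_1(x)$ is increasing in (a, c) and decreasing in (c, b), where 

(93) $c = \dfrac{\alpha - \beta}{\alpha + \beta}$ (J), $\ \alpha$ (L), $\ 0$ (H); $\quad c = x_{1,1}$. 

(Take n = 1 in the differential equation for the classical OP.) (91), combined 
with (86), now gives^{11} 

(94) $H_{i,n} > \dfrac{p_1(x_{i,n})}{p_1(x_{i,n+1})}\,H_{i,n+1}$; $\quad H_{i,n} > \dfrac{p_1(x_{i,n})}{p_1(x_{i+1,n+1})}\,H_{i+1,n+1}$. 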
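Consider, first, Jacobi polynomials. We learn from (91) that 

(95) $\dfrac{H_{1,n}}{p_1(x_{1,n})} > \dfrac{H_{1,n+1}}{p_1(x_{1,n+1})} > \dfrac{H_{1,n+2}}{p_1(x_{1,n+2})} > \cdots$, 

$\dfrac{H_{n,n}}{p_1(x_{n,n})} > \dfrac{H_{n+1,n+1}}{p_1(x_{n+1,n+1})} > \dfrac{H_{n+2,n+2}}{p_1(x_{n+2,n+2})} > \cdots$, 

and this assures the existence of the following limits: 

(96) $\lim_{n\to\infty} \dfrac{H_{1,n}}{p_1(x_{1,n})} = h'$ $(\ge 0)$, $\quad \lim_{n\to\infty} \dfrac{H_{n,n}}{p_1(x_{n,n})} = h''$ $(\ge 0)$. 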


We proceed to show that both h' and h'' are positive. For this purpose we make 
use of Tchebycheff inequalities 

$H_{1,n} > \int_{-1}^{x_{1,n}} p(x)\,dx$, $\quad H_{n,n} > \int_{x_{n,n}}^{1} p(x)\,dx$. 
Hin [ (1 + 2)* "(1 — 2)* "dr 
Pi(21,n) (1 + 2in)*(1 — 21,n)* 
Similarly, 
Han 1 
e. De 
Pi(Ln,n) 28 
(A still simpler procedure is to employ the almost evident inequalities 

$H_{1,n} = \dfrac{1}{K_n(x_{1,n})} > \dfrac{1}{K_n(-1)}$; $\quad H_{n,n} = \dfrac{1}{K_n(x_{n,n})} > \dfrac{1}{K_n(1)}$ 

and to make use of the known values of $K_n(\pm 1)$.) Still easier is it to obtain 
upper bounds for $H_{1,n}$, $H_{n,n}$. By (91), 

$\dfrac{H_{1,1}}{p_1(x_{1,1})} > \dfrac{H_{2,2}}{p_1(x_{2,2})} > \cdots > \dfrac{H_{n,n}}{p_1(x_{n,n})} > \cdots$, 

$\dfrac{H_{1,1}}{p_1(x_{1,1})} > \dfrac{H_{1,2}}{p_1(x_{1,2})} > \cdots > \dfrac{H_{1,n}}{p_1(x_{1,n})} > \cdots$. 

11 This is better than $H_{i,n} > H_{i,n+1}$, $H_{i+1,n+1}$, respectively, as given in page 2 of the 
reference in footnote 9 and yields important results below. 


Here we use 


Hy, = / p(x) (x ar dz = i] p(x) dz, 


12 
and we get 


(a + B)*"*T(a)T (8) 


_ MS Rae Ta + 8) 


Similar simple considerations, applied to Laguerre polynomials (where we 
may use the inequality $H_{1,n} > 1/K_n(0)$), yield 

$\dfrac{1}{\alpha} < \dfrac{H_{1,n}}{p_1(x_{1,n})} < \dfrac{\Gamma(\alpha)}{e^{-\alpha}\alpha^{\alpha}}$. 


(Note the resulting inequality, valid for any $\alpha > 0$: $\Gamma(\alpha) > e^{-\alpha}\alpha^{\alpha-1}$.) We 
summarize our results as follows. For Jacobi polynomials [in (-1, 1)] $H_{1,n}$ 
and $H_{n,n}$ behave asymptotically (n → ∞) like $(1 + x_{1,n})^{\alpha}$ and $(1 - x_{n,n})^{\beta}$ 
respectively, or, which is the same, like $\int_{-1}^{x_{1,n}} p(x)\,dx$, $\int_{x_{n,n}}^{1} p(x)\,dx$ respectively. For 
Laguerre polynomials, $H_{1,n}$ behaves asymptotically like $x_{1,n}^{\alpha}$ or, which is the same, 
like $\int_0^{x_{1,n}} p(x)\,dx$. Namely, 


lim Hin = J'(a, B), lim Hn = J""(a, B), 


awe (1 + 21,n)* no (i + Ian)? 
2a + 6)" (a) (8) 
at PPT (a + B) 
2° "(a + B)** T(a)T(8) 
a*B°T(a + B) 


> J'(a, B) 


> J"'(a, B) 


Hi,» 


lim exists and is 


ane | ° p(x) dx 
1 


12 As a concrete illustration, we may utilize the trigonometric polynomials ($\alpha = \beta = \tfrac12$). 
Here $H_{1,n} = H_{2,n} = \cdots = H_{n,n} = \pi/n$, $p_1(x_{1,n}) = p_1(x_{n,n}) = (1 - x_{1,n}^2)^{1/2} = \dfrac{\pi}{2n}[1 + O(n^{-2})]$, 
so that $\lim_{n\to\infty} H_{1,n}/p_1(x_{1,n}) = 2$. 







T(aje“a * > L'(a) = lim — 2 


(98, L) Hy, n 


lim = ~~ exists and is 2 


a ” p(z) dz 


These formulas give the asymptotic expressions for $H_{1,n}$, $H_{n,n}$, if those for 
$x_{1,n}$, $x_{n,n}$ are known, and vice versa (see below). The known relation of Hermite 
polynomials to the polynomials of Laguerre, with $\alpha = \tfrac12, \tfrac32$ (M, p. 20), 
gives at once, with obvious notations, 


Ain 
2 


Hin 


° as 3 
271.5 (L; a 3), 


Hn+1,20 (H) = (L; = = 3); Haseans (H) = 


and by the above, 


. Hasan 
lim 


nc In+1,2n 


=H’,  (4ne)' > H’ 21; 
(98, H) 
lim Hates _ yy”, (gyxe’)' > H” > }. 
n—-0 In+2,2n+1 

Thus, for Hermite polynomials, the coefficient in the mechanical quadrature 
formula corresponding to the zero nearest to 0 behaves asymptotically like this 
zero itself, which is in accordance with a result obtained by Winston^{13} and 
derived by a less elementary method (Sonine's method). 

On the other hand, if we make use of the asymptotic expression for some of 
the $H_{i,n}$ in Hermite's case,^{14} we get from the above formulas, without any 
further consideration, not only the true order, but also the asymptotic expression 
(n → ∞) for the corresponding zeros. Thus, 


. ‘ T , . i T 
(H) exists lim n'ansi0n = ~ exists lim 7’ 2ns22n41 = . 


2 2 
(L) exists lim Nin = aw (a = 3); exists lim Nin = i’ 
[The existing estimates generally deal with the largest zeros for (H), (L).] 
These asymptotic estimates may be combined with the estimate of the minimum 
distance $\delta(n)$ between two successive zeros for Hermite polynomials. In fact, 
according to Hille,^{15} $2x_{n+1,2n}$ (H) $= \delta(2n)$; $x_{n+2,2n+1}$ (H) $= \delta(2n + 1)$. 
The author hopes to return to the considerations developed above. 


(a = %). 


THE UNIVERSITY OF PENNSYLVANIA. 


13 C. Winston, On mechanical quadratures formulae ..., Annals of Math., vol. 35(1934), 
pp. 658-677. 

14 Loc. cit. (footnote 7), p. 13; also loc. cit. (footnote 9), p. 667. 

15 E. Hille, Ueber die Nullstellen der Hermiteschen Polynome, Jahresber. Deutsch. Math.-Ver., 
vol. 44(1934), pp. 162-165. 





ON BERNOULLI’S NUMBERS AND FERMAT’S LAST THEOREM 
(SECOND PAPER) 


By H. S. VANDIVER 


1. Further examination of Fermat's Last Theorem for special exponents. 
In the first paper under the present title^{1} the writer gave some of the details 
of the computations which resulted in the proof of Fermat's Last Theorem for 
all prime exponents l such that 307 < l < 617, with the exception of 587. At 
the end of the paper it is stated that the work has been carried out for 587 and 
since the criteria are found to hold, the theorem is proved for that exponent. 
The details are as follows. As noted in B.F. (p. 576) the numbers in the set 

(1) $B_1, B_2, \cdots, B_{(l-3)/2}$ 

which are divisible by l when l = 587 are $B_{45}$ and $B_{46}$, so that 587 is irregular 
and, as in the treatment of irregular primes in B.F., we employ Theorem 1 of 
that paper which we repeat here for easy reference: 

THEOREM 1. Under the assumptions: none of the units $E_a$ $(a = a_1, a_2, \cdots, a_s)$ 
is congruent to the l-th power of an integer in $k(\zeta)$ modulo $\mathfrak{p}$, where $\mathfrak{p}$ is a prime 
ideal divisor of p; p is a prime $< (l^2 - l)$ of the form 1 + lk; and $a_1, a_2, \cdots, a_s$ 
are the subscripts of the B's in the set (1) which are divisible by l; the relation 

(2) $x^l + y^l + z^l = 0$ 

is impossible in non-zero integers x, y and z, if l is a given odd prime, and 

$E_n = \prod_{i=0}^{(l-3)/2} \epsilon(\zeta^{r^i})^{r^{-2in}}$, $\quad \epsilon = \left(\dfrac{(1 - \zeta^{r})(1 - \zeta^{-r})}{(1 - \zeta)(1 - \zeta^{-1})}\right)^{1/2}$, 

r being a primitive root of l and $\zeta = e^{2i\pi/l}$. 
Applying this to the case l = 587, we find for r = -10, $d = 2^{14}$, p = 8219, 
$\rho = 2$ and n = 45, ind $E_n(d) \equiv 576$ (mod 587) and for n = 46, ind $E_n(d) \equiv$ # 
(mod 587). Here, as in B.F., d is an integer such that $d^l \equiv 1$ (mod p) and $\rho$ 
is a primitive root of p. Since ind $E_n(d) \not\equiv 0$ (mod l) in the above, the criteria 
of the theorem are satisfied and Fermat's Last Theorem is proved for l = 587. 
As noted in B.F. (p. 576) the prime 617 is irregular and $B_{10}$, $B_{87}$ and $B_{169}$ 
constitute all the B's in the set (1) which are divisible by l. Then applying 
Theorem 1, we find for r = 410, $d = 3^{8}$, p = 4937, $\rho = 3$, ind $E_{10}(d) \equiv 55$; 


Received January 30, 1939. 
1 This Journal, vol. 3(1937), pp. 569-584. This paper will be referred to here as B.F. 




ind $E_{87}(d) \equiv 376$; ind $E_{169}(d) \equiv 25$ (mod 617). In view of this, Fermat's Last 
Theorem is proved for all exponents < 619 and the second factor of the class 
number of $k(\zeta)$ is prime to l for all l's < 619. (Cf. B.F., Theorem 3, p. 581.) 
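
The divisibility statements about the set (1) can be reproduced with a short computation. The following is a minimal sketch and not the method used for the computations reported here. It assumes the older numbering in which $B_n$ denotes the modern Bernoulli number of index 2n up to sign, and it works entirely modulo l through the standard recurrence $\sum_{k=0}^{m} \binom{m+1}{k} B_k = 0$. The function names are hypothetical.

```python
def bernoulli_mod_p(p):
    """Return [B_0, ..., B_{p-3}] (modern indexing) reduced modulo the odd prime p."""
    # Factorials and inverse factorials modulo p, up to (p-2)!.
    fact = [1] * (p - 1)
    for i in range(1, p - 1):
        fact[i] = fact[i - 1] * i % p
    inv_fact = [pow(f, p - 2, p) for f in fact]

    def binom(n, k):
        return fact[n] * inv_fact[k] % p * inv_fact[n - k] % p

    B = [0] * (p - 2)
    B[0] = 1
    for m in range(1, p - 2):
        s = sum(binom(m + 1, k) * B[k] for k in range(m)) % p
        B[m] = (-s * pow(m + 1, p - 2, p)) % p  # m + 1 < p, so it is invertible
    return B

def irregular_indices(p):
    """Even m with 2 <= m <= p-3 such that p divides the numerator of B_m."""
    B = bernoulli_mod_p(p)
    return [m for m in range(2, p - 2, 2) if B[m] == 0]

if __name__ == "__main__":
    for l in (37, 587, 617):
        modern = irregular_indices(l)
        print(l, modern, [m // 2 for m in modern])  # second list: older subscripts n = m/2
```

For l = 587 the older subscripts printed are 45 and 46, in agreement with the statement above.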

The above-mentioned extensive computations concerning l = 587 and l = 617 
were carried out by M. E. Tittle, who, with M. M. Abernathy and D. H. Lehmer, 
also directed or carried out all the computations described in B.F. 

The prime l = 617 just disposed of is a particularly interesting one since it is 
the first prime so far encountered in our work in which three distinct B's in (1) 
are divisible by l. Since the criteria hold for this case, there is no indication 
yet that they will fail because of the number of B's in (1) which are divisible by l. 

In B.F. (top of p. 570) we noted that we have persisted in the examination 
of special exponents in (1) in the hope that, if the criteria of Theorem 1 fail 
for a particular l, we shall find such an l within the range of our computations. 
We shall point out here, however, that the possible fact that they might fail 
for a particular l will not immediately appear when we apply Theorem 1. For, 
suppose we find that ind $E_n(d) \equiv 0$ (mod l) for some particular d, l and p; then of 
course it does not necessarily follow that there will not be another p, say $p_1$, 
in the range mentioned in Theorem 1 such that if $d_1^l \equiv 1$ (mod $p_1$), 

ind $E_n(d_1) \not\equiv 0$ (mod l). 

But since ind $E_n(d) \not\equiv 0$ (mod l) for every value of l so far tested with the least 
possible value of p which satisfies p = 1 + cl, we should be a little suspicious 
about any exponent l for which it was found that ind $E_n(d) \equiv 0$ (mod l) for the 
first two or three possible values for p, and then it would be in order to subject 
the cyclotomic field $k(\zeta)$ corresponding to this particular value of l to a special 
examination. For example, we would try to find out if either of the congruences 
(u = 3(l — 1)) 
B,, = 0 (mod P), Bui, = 0 (mod [’) 


held. (Cf. B.F., p. 582.) 
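
The "least possible value of p which satisfies p = 1 + cl" is easy to locate by direct search. The sketch below is illustrative only. It assumes trial division is adequate for primes of this size, and it takes the primitive root $\rho$ as given by the caller (the values 2 and 3 are those quoted in Section 1 and are not verified here). The function names are hypothetical.

```python
def is_prime(m):
    """Trial-division primality test, adequate for the small values used here."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def least_prime_1_mod(l):
    """Return (p, c) where p = 1 + c*l is the least prime of that form."""
    c = 1
    while not is_prime(1 + c * l):
        c += 1
    return 1 + c * l, c

def d_from_root(rho, l):
    """Return (d, p) with d = rho^c mod p, so that d^l = rho^(p-1) = 1 (mod p)."""
    p, c = least_prime_1_mod(l)
    return pow(rho, c, p), p

if __name__ == "__main__":
    for l, rho in ((587, 2), (617, 3)):
        p, c = least_prime_1_mod(l)
        d, _ = d_from_root(rho, l)
        print(l, p, c, d, pow(d, l, p), p < l * l - l)
```

For l = 587 and l = 617 the search returns p = 8219 (c = 14) and p = 4937 (c = 8), the primes used in Section 1, and both lie below $l^2 - l$.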

2. Possible extensions of Theorem 1. This theorem contains the limitation 
"p is a prime $< (l^2 - l)$ of the form 1 + kl". In the first place it is not known 
that a prime p of this type exists for every l. Hence we examine the possibility 
of extending the theorem by widening the range for p. As in a previous paper^{2} 
we consider 


(3) w + 6 = —-y’, 
where 6 = 0 (mod X‘), s > 0, A = (1 — 9), and obtain therefrom, if p = 1 + el, 
(w + 0g*)’ = @ + OF“) (mod p), 


? Transactions of the American Mathematical Society, vol. 31(1929), pp. 631-632, rela- 
tions (22) and (24a). 







where p is defined as in Theorem 1, with c = k. From this we obtain, by 
expansion, allowing a to range over the set 0, 1, ..., l - 1 and eliminating $\zeta$, 


c snc—s c atl nce—(stl) 

(‘). G+ (, + ))e 6 + -:- 
ie c l—s pe—(1—s) c 2l—s ge—(2l—a) a 
=(,°,)« , +(o° ,)« , ™ 


(4) 


modulo p, where s = 0, 1, ..., l - 1 and $\binom{c}{h} = 0$ for h > c. Taking c < l - 1 
and s = c - 1, assuming that $\omega$ is prime to p, we have $\theta \equiv 0$ (mod p), and this is 
analogous to the relation (25a) of the paper just mentioned. For larger values 
of c these congruences (4) are more complicated, but obviously we may state 
criteria concerning (2) in Case II, based on them. These congruences are 
related to some congruences given in a previous paper.^{3} 

Another point of view is to take (2) and obtain therefrom 


r+ Sy = naa, 
where y = 0 (mod 1). Now if just one of the B’s in the set (1) is divisible by /, 
say B,,, then it is possible to show that since y, is primary, 
ta = E, 6, 
and since E*"" = g' (using the Kronecker-Hilbert notation of symbolic powers) 
we obtain 


rin | 


rt ey =(rx+ Sy) a1, 
with o; a number in k(¢), and therefore 
(x + ey) = (x + Sy)” (mod p). 


By expansion and eliminating ¢ by using various values of a, we obtain another 
set of congruences involving z and y. By taking r’, r’, ete. in lieu of r we obtain 
numerous relations of this type. If more than one B in (1) is divisible by |, 
then we may extend this idea, and by taking a certain symbolic power of (z + 
¢“y), we may obtain congruences which become quite complicated when 4 
number of B’s are divisible by 1. This line of attack seems to have the pecu- 
liarity, however, that it yields results depending on the fact that z and y are 
rational. 


3. Case I of Fermat's Last Theorem and the second factor of the cyclotomic 
class number. Take the relation (2) and assume that $xyz \not\equiv 0$ (mod l) and 
(x, y, z) = 1; then 


(5) r+ ity =a’, 


3 Proc. Nat'l. Acad. Sci., vol. 15(1929), theorem on p. 45. 







where a is an ideal in k(¢). If we use some results of Pollaczek,* it follows that 
we have 
(5a) a+ fy = qoi'ws’ --- 056", 
where the w’s are singular numbers; that is, we have (w;) = 6;, where @ is not 
a principal ideal in A(¢); 7 is a unit and @ an integer in k(¢). 
Each w has the property that 
r rt l 
(6) w(f") = @(S))' +, 


r being a primitive root of 1 and r a number in k(¢). Hence’ either wo, = y’ 
orw/w_; = £' according as 7 is odd or even, w, denoting the number obtained from 
w by the substitution (¢/*). If there is anw = 6” which satisfies (6) for 7 odd 
and which also divides (x + fy), then we proceed in the following way. We 
introduce /-th power characters defined by 


(? ef. gX@-Dip — (*) (mod p), 


p being an ideal prime in k(¢) prime to 1@ and N(p) = p’ the norm of p. Further 
for a an ideal prime to / and @, let 


Q = Pipe--- D, 


where the p’s are ideal primes in k(¢); then 


(3) - GG) @). 


From (6) 6b_, , since 7 is odd, is a principal ideal, say (a), and we may apply 
Furtwangler’s law of reciprocity and obtain, if a # +1 (mod 1), a # 0 (mod J), 


(8) = (82) 
a { r+ fy) 


From (5) 


i § y 


(7) (ZtFs) .(o2try) 
j a { , 
a#+1(modl). If weset 


(7a) (* -d . ; ’} = f°, 


4 Math. Zeitschrift, vol. 21(1924), pp. 1-39. 
5 Pollaczek, loc. cit., p. 22. 










then, if V(8) is the norm of 8 in k(¢), 


heii wet y = + al,(x + fy) N = real. 
(7b) 7 
+ Do (-1)" 7a" 1 (a)l'(w + Sy). 
s=2 
Also, if 
@=at+tagvt+---+ oa’, 
then 
6(e") = a& ao aye" + tae + ent?) 
and 


1 (6) = [ ag | _ 


Now by a result in a previous paper of the writer’s” we have 


Ne + fy) — 1 _ 9 (mod) 
l b J 


and by Furtwangler’s result that if c divides z in (2), then 
c' = 1 (mod P), 


we obtain, since b and b_, divide z by hypothesis, 


—, = = 0 (mod J). 
Now 
(8) (: + y) a (: tty) = (e+ ya + Fy)! _ 
a bb_, b 
and 
rtsy\_ (xrt+ty+- iy) " (é - im) 
an (CfA) = (etm =) (65,00) 


Similarly, we have 


os (2478). (©= FW), 


We also employ the relation’ 


r! ome li 2n = ° P 
ind ( ') = (¢-—l1)indt -2 >> (t 1) ind E,(¢) 
t — l 2 ani rs — ] 


6 Proc. Nat'l. Acad. Sci., vol. 15(1929), p. 44. 
7 Kummer, Journal für Math., vol. 56(1859), p. 277. 







modulo 1, where 1, = 3(l — 3) and (6/6) = ¢"*°. Applying this to (7), using 
(7a), (7b), (8), (8a), and (8b), we have, noting that ind (¢) = 0, 


2> ((a — 1)" — 1) ind E, (5) | o> ((a + i — 1) ind E,(¢) 


n=1 7 1 n=1 r? — j 
1-2 
(9) = > (—1)* 1a “1 (al (a + ty) 
s=2 


modulo 1. From this relation we obtain by expansion a congruence which 
may be put in the form 


1-2 
—4(a""* Aon + a?” * Ang + +++ + aA;) = > (—1)"* aU (al (x + gy), 


modulo 1. After dividing by a we may write 


$(1—3) 


~4o << a" *(4Aon-1 + (—3)? P(e” (zx + ty)) 
(10) en 
-_ > a! 28) 272+) (qin My + cy) a a’ *P (a)? (x + cy). 
s,=1 


By the Kummer criteria for Fermat’s Last Theorem, we have 


Bl (x + gy) = 0 (mod J), 
and since B, ¥ 0 (mod J), 
i (2 + gy) = 0 (mod I). 


Employing this in (10) and setting a = 2, 3, ..., (l - 2) in turn, we have 
(l - 3) congruences, and since the determinant of the powers of the a's is an 
alternant which is prime to l, since each a < l, we have 


Ao =0, 4A na — "Pal?" (a + Sy) = 0 (mod J), 
(11) (n = 2,3,---,3(1 — 3)); 
12 F) (gyi ?—Y (2 + gy) = 0 (mod J), (s = 1, 2,--- , $l — 3)). 
The last relations are trivial since 
(a) = bb, 
and hence a belongs to the real field k(¢ + ¢"") and 


= eee) = 0 (mod J). 
v=0 


dy*t+ 


From the relations (11) we obtain for n = 3(1 — 3), by using the actual values 
of the A’s from (9), 







‘ l — 3 . » wa: 
(12) Ais = 3 ay ind E,,(¢), 


3 ris | yi 


(13) pe ( - *) ind E;,(¢) + (1 — 5) ind Er, aS) 


Also, in the Kummer criteria 
BI" (2 + gy) = 0 (mod J) 
for the solution of (2), let n = 2, 3, 4, 5,6 and 7. These criteria give, as in 
another paper” 
i'?"’(z + gy) = 0 (mod J) (n = 2, 3, 4, 5, 6, 7), 


and these congruences applied to (11), (12), and (13) give 
ind £,,(¢) = ind £,,1(¢) = 0 (mod J), 
and similarly, by taking the values of A;-s, etc. in (11), we find 
ind E,,_(¢) = 0 (mod /) (j = 2, 3, 4, 5). 


We may now follow the argument as given on pages 122 and 123 of the paper 
last mentioned and obtain the contradiction mentioned there just at the end of 
paragraph (1), page 123, and we have another proof of Theorem 1, page 118: 

If (2) is impossible in Case I, then the second factor of the class number of the 
cyclotomic field $k(\zeta)$ is prime to l. The relation (11) of the present paper is different, 
however, from any of those given in the former paper, and in the main the ideas 
given here constitute extensions of those employed there. Now, instead of taking 


(229 
a 


and treating it as above, we might have used 


") 
2) 


where (8) = 6" *'. We would thereby obtain 


(7 + ry) a ( + ey)" (x + of) 


(14) 3 6 


where rr; = 1 (mod I). 

We may then treat (14) as (8) was treated and obtain a set of congruences 
analogous to (9), but more general. 

By employing (5a) and (6) and proceeding as in another paper of this writer's,^{9} 


8 Vandiver, Bull. Amer. Math. Soc., vol. 40(1934), p. 122. 
9 On criteria for singular integers in a cyclotomic field, Proc. Nat'l. Acad. Sci., vol. 24 
(1938), pp. 330-333. 







we obtain 
i-—1 
(15) IT (x + 2 lie ” wy, 
where 
oS) = (ols), 
o being a number in k(¢), with s in the set 1, 2, --- , $(1 — 3). 


4. Criteria involving Bernoulli numbers. In B.F., pp. 582-583, we men- 
tioned a Theorem 4, giving criteria for Fermat’s Last Theorem involving the 
assumption that none of the Bernoulli numbers 


But (n = 1,2,---, $(l — 3)) 
is divisible by I’. Elsewhere” we have stated that if (2) holds in Case I, then 
B, = 0 (mod I’) (s = nw — 134 = 2, 3, 4, 5,6; 4 = 4(l — 1)), 


where each n ranges over all positive integers. In the reference to B.F. just 
mentioned (middle of page 582) it was shown that if two of the numbers in the 
set 


(16) B,, ’ Bats grr Basc-»p 


are divisible by I’, then all of them are. Also, a necessary condition that k(¢) 
contain an ideal belonging to the exponent [’ is that one of the numbers Byci41 
(t= 1,3, --- ,l — 4) be divisible by P’. 

We shall now consider methods for testing these various criteria in special 
cases. In a previous paper’ the writer described a method for obtaining the 
least residue of B,; modulo /° where B, = 0 (mod 1), using the formula 


(—1)°" B22" — 1) 


= = ie + geet + Jen + (1 _ i . 


modulo ’. Beeger™ obtained the congruence 


_yi-t Ba Spy wut {Qn 2n — 8 Ba+(o—vp 
ae 2nl p> (“1 (, _ )( i-s ar 1) 


modulo I’, 2n # 0 (mod / — 1). 
This gives fori = 2 
l 


Bu = -— ((2n — 1)°B, — (—1)*(2n)* Busy) 
2n — 1 


10 Bull. Amer. Math. Soc., vol. 40(1934), p. 124. 

11 Trans. Amer. Math. Soc., vol. 31(1929), p. 639. 

12 On some new congruences in the theory of Bernoulli's numbers, Bull. Amer. Math. Soc., 
vol. 44(1938), p. 688. 







modulo I’. This was employed by Beeger to show that B,,; 4 0 (mod 0’) for 
any B, such that B, = 0 (mod 1) and for! any prime < 211 (with several excep. 
tions which were not tested). Emma Lehmer”™ derived the congruence 

(p/6) 


2n—1 2n—-1 2n—1 
(19) a +37 =D LS @— o)™ (mod 
r=] 


where | > 5 and [2] is the greatest integer in z. 

Putting these methods and results together, we evolve the following con- 
venient scheme for finding which numbers in the set (16) are divisible by f. 
Take a congruence employed by Pollaczek^{14} 


(20) Bray. = yBia, — (y — 1)B) (mod P), 


where 


Bi = (—1)'B, 
u 


and y 2 0. Suppose first, using (19), we find that 
(21) B44 = Bi (mod P); 
then (20) reduces to 

Brive = B, (mod P), 


so that there is no member of the set (16) divisible by I’ unless all are. 

Suppose next that (21) does not hold; then determine 0 < y < / from the 

linear congruence 

y wie = Be * = 0 (mod J), 
whence, from (20), Bi), = 0 (mod P), and this is the unique B in (16) whichis 
divisible by 7’. For, otherwise all the B’s in (16) are divisible by [’, and (21) 
holds, contrary to hypothesis. 

As we have already noted, the conditions B,,;_, = 0 (mod [’) and B,,; =? 
(mod /*) occur in connection with various questions concerning k(f). The 
relation (21) also appears in theorems concerning the field defined by a primitive 
?-th root of unity.” In employing (19) it may be most convenient when / is 


13 Annals of Math., vol. 39(1938), p. 352, relation (13). 
14 Math. Zeitschrift, vol. 21(1924), p. 36. This congruence was generalized by Beeger, 
loc. cit., p. 684. 
15 Cf. Pollaczek, loc. cit., p. 29; Morishima, Japanese Journal of Math., vol. 11(1935), 
p. 239. 







large to write it in the form 


[1/6] 


2n—1 2n—1 ~sS 
) B,(6 +3 = 2 1) = > gt + (2n = 1)6""? *"*)]) 
r=] 


(22 


modulo 1’; for, as indicated elsewhere,” it is convenient in finding the least 
residue of (ak)""* modulo ?° to obtain it from the least residues of a” and k"™. 
The least residue of the coefficient of J in the right member of (22) may be 
determined easily if we use Jacobi’s table of indices. 


UNIVERSITY OF TEXAS. 


16 Vandiver, Trans. Amer. Math. Soc., vol. 31(1929), p. 641. 





NOTE ON TOPOLOGICAL MAPPINGS 


By J. H. ROBERTS 


E. W. Miller^{1} has given an example of an acyclic curve M such that if f is any 
topological mapping of M into a subset of itself, then f(M) = M. R. Baer^{2} 
has given an example of an acyclic curve M such that if f is a topological function 
and f(M) = M, then f is the identity. Neither of these examples has both of the 
properties mentioned above. O. Hamilton^{3} has raised the question as to 
whether or not any acyclic curve has both the above properties. The present 
paper answers this question in the affirmative by describing a compact acyclic 
continuous curve H such that the only topological function mapping H into a 
subset of itself is the identity. 

Now Menger's^{4} "universal tree of order 4" is made up as follows: (1) There 
is a single interval S which is called the interval of the "0-th degree". (2) For 
each point P of a countable set $T_0$ dense on S, but not containing an end-point 
of S, there are two intervals having P as end-point, these intervals being of the 
1-st degree. (3) In general, for every n ≥ 0 there is a countable set $T_n$ dense 
on every interval of the n-th degree and for each point P of $T_n$ there are two 
intervals having P as end-point, these intervals being the intervals of the 
(n + 1)-th degree. (4) The curve M is the sum of all the intervals of all the 
different degrees, plus all limit points of this sum. 

Our curve H will be defined as a subset of Menger’s curve M. To get H we 
modify M in this way: Having decided that a certain interval J of degree r 
(in M) is to be in H, we may wish to have only one interval of degree r + 1 
for each of the junction points on J. In this case we select arbitrarily (to be a 
part of H) one of the two intervals of degree r + 1 ending in each junction point 
on J. In the future we will indicate this by writing “the junction points on J 
are to be of order 3 in H”’. 

It is convenient to use the following notation: Suppose P is a junction point 
of M on an arc of degree r. Then an arc J of degree > r is said to "join on 
through P" if P separates J (or J - P) from S (or S - P). 

We now set up a 1-1 correspondence between the set of all finite permutations 
of positive integers and the integers of the form $2^k$. Let $x_{ij\cdots k}$ be the 


Received March 9, 1939. 

1 The Zarankiewicz problem, Bull. Amer. Math. Soc., vol. 38(1932), pp. 831-834. 

2 Beziehungen zwischen den Grundbegriffen der Topologie, Sitzungsberichte der 
Heidelberger Akademie der Wissenschaften, 1929, no. 15. 

3 Fixed points under transformations of continua, Trans. Amer. Math. Soc., vol. 44(1938), 
pp. 18-24; especially p. 24. 

4 Kurventheorie, Leipzig, 1932, p. 318. 




integer associated with the permutation $ij \cdots k$. The peculiar property of 
the integers of the form $2^k$ of which we make use here is the following: 
I. No one of them is equal to a sum of any of the remaining ones. 
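
Property I can be confirmed by brute force on an initial segment of the integers concerned. The sketch below is an illustration under two assumptions that are not stated explicitly in the text: that the integers are the powers $2^k$, and that a "sum of any of the remaining ones" uses each remaining integer at most once. The function name is hypothetical.

```python
from itertools import combinations

def property_one_holds(values):
    """True if no member of `values` equals a sum of distinct other members."""
    for v in values:
        others = [w for w in values if w != v]
        for size in range(1, len(others) + 1):
            for subset in combinations(others, size):
                if sum(subset) == v:
                    return False
    return True

if __name__ == "__main__":
    powers = [2 ** k for k in range(10)]
    print(property_one_holds(powers))      # powers of two
    print(property_one_holds([1, 2, 3]))   # fails here, since 3 = 1 + 2
```

The underlying reason is the uniqueness of binary expansions: a sum of distinct powers of two has a binary digit 1 exactly at those powers, so it cannot equal a single power that is absent from the sum.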


Let $P_1, P_2, \cdots$ be the junction points of M on S. This interval is to be in H, 
and the points $P_1, P_2, \cdots$ are all to be of order 3 in H. Let J be an interval 
of degree r which is in H and which joins on through $P_i$. If $r < x_i - 1$, then 
the junction points on J are all to be of order 3 in H. If $r = x_i - 1$, these 
points are all to be of order 4. Let $\{P_{ij}\}$ (j = 1, 2, ...) be a countable set 
dense on every arc of degree $x_i - 1$ which is in H and which joins on through $P_i$. 
These points are of order 3 in H. Suppose now the arc J of degree r ($\ge x_i$) is 
in H and joins on through $P_{ij}$. Then if $r < x_i + x_{ij} - 1$, the junction points 
on J are to be of order 3 in H. If $r = x_i + x_{ij} - 1$, these junction points are 
to be of order 4. We can continue in this way to specify the order of the 
junction points (to be 3 or 4) so that the following properties obtain: 

(1) On any given interval all junction points (except possibly one end-point 
of this interval) are of the same order. 

(2) Suppose $ij \cdots kl$ is a permutation of positive integers, J is an interval 
of degree r which is in H and which joins on through the point $P_{ij\cdots kl}$. 

Let 

$q_1 = x_i + x_{ij} + \cdots + x_{ij\cdots k}$, $\quad q_2 = q_1 + x_{ij\cdots kl}$. 

If $q_1 \le r < q_2 - 1$, then the junction points on J are of order 3 in H. 
If $r = q_2 - 1$, these junction points are of order 4. The point set 

$\{P_{ij\cdots klm}\}$ (m = 1, 2, 3, ...) 

is dense on every interval of degree $q_2 - 1$ in H which joins on through $P_{ij\cdots kl}$. 

This completes the definition of H. 

We note for future use the following property: 

II. Suppose $Z_1, Z_2, \cdots$ is a sequence of junction points, such that (1) the arc 
$Z_iZ_{i+1}$ (in H) is a subset of a single interval, and (2) if the degree of the interval 
containing $Z_1Z_2$ is s + 1, then the degree of the interval containing $Z_iZ_{i+1}$ is s + i. 
Then if $Z_h$ and $Z_k$ are of order 4, and k > h, k - h is a sum of numbers of the 
set $\{x_{ij\cdots l}\}$. 

Suppose now that f is a topological function mapping H into a subset of H. 
We wish to show that for every $P \in H$ we have f(P) = P. Since the set $\{P_{ij\cdots k}\}$ 
is dense in H, it will be sufficient to show that for any $ij \cdots k$ we have 
$f(P_{ij\cdots k}) = P_{ij\cdots k}$. Suppose that this is false and that f(P) = Q ≠ P, where P is 
some $P_{ij\cdots k}$. 

Select mutually exclusive open sets U and V in H containing P and Q respectively, 
and such that $f(U) \subset V$. Let PX be an arc which is a subset of an 
interval in H and let r be the degree of this interval. Then there exists^{5} a point 

5 This follows easily from Menger's "n-Beinsatz", ibid., p. 214. The immediate assumption 
is the following theorem: If H is an acyclic continuous curve, B is a point of H of 
order n, and $A_iB$ is an arc in H, for i = 1, 2, ..., n, n + 1, then for some i and j (i ≠ j) 
there is a point X such that the arc XB in H is a subarc both of $A_iB$ and of $A_jB$. 










$Y_1$ on the arc PX such that (1) $Y_1$ is a junction point of M, (2) $PY_1$ is in U, 
and (3) if $f(PY_1) = QZ_1$, then $QZ_1$ is a subset of a unique interval of H, say one 
of degree s. 

Then on the interval (or on one of the two intervals) in H of degree r + 1 
having $Y_1$ as an end-point there exists a point $Y_2$ such that (1) $Y_2$ is a junction 
point of M, (2) $Y_1Y_2$ is in U, and (3) if $f(Y_1Y_2) = Z_1Z_2$, then $Z_1Z_2$ is a subset 
of an interval of M of degree s + 1. 

This process can be continued indefinitely. Thus there exists an infinite 
sequence of arcs $PY_1, Y_1Y_2, Y_2Y_3, \cdots$ such that for each i, we have 

(1) $Y_i$ is a junction point of M; 

(2) $Y_iY_{i+1}$ is in U and is a subset of an interval of degree r + i; 

(3) $f(Y_i) = Z_i$, and the arc $Z_iZ_{i+1}$ is a subset of an interval of degree s + i. 

But the following consideration leads us to a contradiction: Among the 
points $Y_1, Y_2, \cdots$ let $Y_h$ and $Y_k$ be respectively the first and second of order 4. 
Then k - h is some integer $x_{ij\cdots l}$ associated with a point $P_{ij\cdots l}$ in U. Now 
clearly $Z_h$ and $Z_k$, being images of $Y_h$ and $Y_k$ respectively, are of order 4. 
But then by Property II k - h is a sum of numbers of the form $x_{ij\cdots l}$ associated 
with points in V. Since U·V = 0, $P_{ij\cdots l}$ is not in V. 

Thus by Property I we have reached a contradiction. 


DUKE UNIVERSITY. 

























CONTRIBUTIONS TO THE THEORY OF GROUPS OF FINITE ORDER 


By OYSTEIN ORE 


The present paper contains a number of results of diverse nature in the theory 
of finite groups. One may say that the guiding principle is the application of 
structure theory to the theory of groups. In two recent papers^{1} I have already 
shown that this method is very useful for various investigations in group theory. 
In the present paper this point of view is particularly important in the study of 
non-normal chains of subgroups of groups, a field which seems to be untouched 
until now. One of the main problems which is solved by this structural approach 
is the determination of all groups in which every chain of subgroups, with each 
term maximal, but usually not normal, in the preceding shall have the Jordan- 
Hélder property that the indices are the same in some order. 

Let us also make the following general remark. Groups are ordinarily 
defined by their elements, or, equivalently in structural terms, each subgroup 
is the union of cyclic groups, and hence the group properties are naturally 
stated by means of element properties. By dualizing this process one is led 
to the investigation of the properties of a group in relation to its maximal sub- 
groups and several of the results of this paper may be said to belong to this 
category. 

In the first chapter various properties of permutable groups are derived and 
particularly the existence of permutable decompositions is investigated. It is 
shown that if a group has the property that all maximal subgroups are per- 
mutable, they are all normal and the group is nilpotent. The concept of quasi- 
normality introduced in the preceding papers is studied further, and it is shown 
that in several cases normality and quasi-normality are identical concepts. In 
particular, no simple groups can contain quasi-normal subgroups. 

In the second chapter properties of normal decompositions as union and the 
dual decomposition as cross-cut are considered and the relation to the 
representations of the group as a permutation group is pointed out. Here one also 
finds the determination of the φ-group of a group. 

In the third chapter properties of arbitrary non-normal chains of subgroups 
are deduced. It is shown that to any complete chain there exists a chain with 
the same index type passing through any prescribed principal chain. This 
result reduces the study of various properties of such chains to the case of 
simple groups. 


Received March 24, 1939. 
1 Oystein Ore, Structures and group theory, I and II, this Journal, vol. 3(1937), pp. 149-174; 
vol. 4(1938), pp. 247-269. These papers will be quoted in the following as Ore I, II. 




When this theory is applied to solvable groups in Chapter IV, it follows 
immediately that all consecutive indices in such groups are powers of primes 
and each subgroup is permutably contained in any immediately preceding 
group. The properties of maximal groups in solvable groups are particularly 
interesting. Each such group belongs to a unique normal subgroup which 
determines it except for isomorphism. Any two maximal subgroups are con- 
jugate or permutable, and if p divides the group order, there exists a maximal 
group whose index is a power of p. This last statement is equivalent to a result 
of P. Hall. 

In the last chapter one finds the solution of the problem of finding all groups 
in which any maximal chains have the property that the indices of two chains 
are the same in some order. It is shown that these groups must be solvable.
They form a class of groups which may also be characterized by various other 
properties, for example, the properties that all consecutive indices be primes, 
that there exist a complete principal chain, or that the subgroups and quotient 
groups have subgroups of every possible order. 


Chapter I. Permutability 
1. Permutable groups. In the following we shall consider the subgroups of a
group G. When A and B are two such subgroups, we shall denote their union
by $A \cup B$ and their cross-cut by $A \cap B$.

The two subgroups shall be said to be permutable if to every $a_1$ in A and $b_1$
in B there can be found a second pair $a_2$ and $b_2$ such that $a_1 \cdot b_1 = b_2 \cdot a_2$.
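The definition can be tried out on a small example. The following is only a sketch under stated assumptions: permutations are stored as Python tuples, and the helper names `compose`, `generate` and `permutable` are illustrative and do not come from the paper. It checks the defining condition directly in the symmetric group on three letters.

```python
def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens, n):
    """Subgroup of the symmetric group on n points generated by gens."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def permutable(A, B):
    """True when every product a1*b1 can be rewritten as some b2*a2."""
    BA = {compose(b, a) for b in B for a in A}
    return all(compose(a, b) in BA for a in A for b in B)


# Three subgroups of the symmetric group on {0, 1, 2}.
A = generate([(1, 0, 2)], 3)   # generated by the transposition (0 1)
B = generate([(1, 2, 0)], 3)   # generated by the 3-cycle (0 1 2)
C = generate([(0, 2, 1)], 3)   # generated by the transposition (1 2)

print(permutable(A, B))  # True: B is normal, so A and B are permutable
print(permutable(A, C))  # False: two distinct subgroups of order 2
```

In this example the product set of A and C has four elements and is therefore not a subgroup, which is why the second check fails.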

Let us recall the following properties of permutable subgroups:²

If A is permutable with B and C, then A is permutable with $B \cup C$.

A group is permutable with all its subgroups.

If $\bar{B} \supset B$ and A is permutable with B, then B is permutable with $A \cap \bar{B}$.

If $\bar{A} \supset A$ and $\bar{B} \supset B$ and A and B are permutable, then $\bar{A} \cap B$ and $A \cap \bar{B}$
are permutable.
One of the fundamental properties of the permutable groups is the 


DEDEKIND RELATION. Let A and B be permutable and $C \supset A$. Then
$C \cap (A \cup B) = A \cup (C \cap B)$.
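The Dedekind relation can be verified on a concrete case. The sketch below is an illustration only: it assumes permutations stored as tuples, and the names `compose`, `generate` and `join` are hypothetical helpers. Inside the symmetric group on four letters it takes A to be the four-group, B the subgroup fixing the last letter, and C a dihedral group of order 8 containing A.

```python
def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens, n):
    """Subgroup of the symmetric group on n points generated by gens."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def join(X, Y, n):
    """Union of two subgroups in the sense of the paper: the subgroup they generate."""
    return generate(list(X | Y), n)


A = generate([(1, 0, 3, 2), (2, 3, 0, 1)], 4)   # four-group, normal in S_4
B = generate([(1, 0, 2, 3), (1, 2, 0, 3)], 4)   # permutations fixing the point 3
C = generate([(1, 2, 3, 0), (2, 1, 0, 3)], 4)   # dihedral group of order 8

assert A <= C                                    # C contains A, as the relation requires

left = C & join(A, B, 4)                         # C ∩ (A ∪ B)
right = join(A, C & B, 4)                        # A ∪ (C ∩ B)
print(left == right, len(left))
```

A is normal here, so it is permutable with B and the hypothesis of the relation is satisfied.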

One can express the condition for two groups to be permutable as follows: 

THEOREM 1. Let

$$A = \sum a_i T, \qquad B = \sum b_j T$$

be the co-set expansions of two groups with respect to a common subgroup T. The
necessary and sufficient condition for the two groups to be permutable is that there exist
relations

(1)   $a_i \cdot b_j = b_k \cdot a_l \cdot t$

for every pair $a_i$, $b_j$.


² Ore I, Chapter 2.









CONTRIBUTIONS TO THEORY OF GROUPS OF FINITE ORDER 433 


Proof. An arbitrary a in A and b in B may be written in the form $a = a_i \cdot t$,
$b = b_j \cdot t'$, where $t$ and $t'$ belong to T. When (1) holds, one obtains
$a \cdot b = a_i \cdot t \cdot b_j \cdot t' = a_i \cdot b_{j'} \cdot t'' = b_k \cdot a_l \cdot t''' = b' \cdot a'$,
and the converse follows similarly.
Now let A and B be permutable and let us write 


$$M = A \cup B, \qquad D = A \cap B.$$

If then

(2)   $B = \sum b_i D$

is a co-set expansion of B with respect to D, then

(3)   $M = \sum b_i A$

is the co-set expansion of M with respect to A. Even if A and B are not
permutable, the right-hand co-sets in (3) are all distinct, and hence one always
has the index relation

(4)   $[(A \cup B) : A] \ge [B : (A \cap B)]$

and the equality holds only when A and B are permutable.
The correspondence

(5)   $b_i D \leftrightarrow b_i A$

is a one-to-one correspondence between the elements in the two quotient systems

(6)   $A \cup B / A \leftrightarrow B / A \cap B$.
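The index relation (4) can be illustrated numerically. This is a sketch under assumptions, not part of the paper: permutations are tuples, and `compose`, `generate` and `index_pair` are hypothetical helper names. The function returns the two indices that are compared in (4).

```python
def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens, n):
    """Subgroup of the symmetric group on n points generated by gens."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def index_pair(A, B, n):
    """Return ([A ∪ B : A], [B : A ∩ B]) for two subgroups A and B."""
    M = generate(list(A | B), n)   # union in the sense of the paper
    D = A & B                      # cross-cut
    return len(M) // len(A), len(B) // len(D)


A = generate([(1, 0, 2)], 3)   # order 2
B = generate([(1, 2, 0)], 3)   # order 3, permutable with A
C = generate([(0, 2, 1)], 3)   # order 2, not permutable with A

print(index_pair(A, B, 3))  # (3, 3): equality, the groups are permutable
print(index_pair(A, C, 3))  # (3, 2): strict inequality
```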


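From this correspondence one easily derives a set of others:³
let $\bar{A} \supseteq A$, $\bar{B} \supseteq B$.
If A is permutable with $\bar{A} \cap B$ and $\bar{A} \cap \bar{B}$ and B is permutable with $A \cap \bar{B}$
and $\bar{A} \cap \bar{B}$, then
$$A \cup (\bar{A} \cap \bar{B}) / A \cup (\bar{A} \cap B) \leftrightarrow B \cup (\bar{B} \cap \bar{A}) / B \cup (\bar{B} \cap A).$$
If $A \cup B$ is permutable with $\bar{A}$ and $\bar{B}$, then
$$\bar{A} \cap (A \cup \bar{B}) / \bar{A} \cap (A \cup B) \leftrightarrow \bar{B} \cap (B \cup \bar{A}) / \bar{B} \cap (B \cup A).$$

If A is permutable with B and $\bar{B}$, and B is permutable with A and $\bar{A}$, then the
last two correspondences are identical and one also has

$$\bar{A} \cap \bar{B} / \bar{A} \cap \bar{B} \cap (A \cup B) \leftrightarrow A \cup B \cup (\bar{A} \cap \bar{B}) / A \cup B.$$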


The correspondence (5) does not usually give a correspondence between the 
subgroups of the two quotient structures (6). We shall now analyze this side 
of the correspondence further. It is seen that if $\bar{A} = \sum b_i A$ is a co-set
expansion of any subgroup of M containing A, then the multipliers $b_i$ generate
a group $\bar{B}$ containing D and such that the index relation

(7)   $[\bar{A} : A] = [\bar{B} : D]$


³ Ore I, Chapter 2.






































holds. This group $\bar{B}$ may also be given explicitly

(8)   $\bar{B} = B \cap \bar{A}$,

and it is seen to be permutable with A.

Conversely, let $\bar{B} = \sum b_i D$ be some subgroup of B containing D. Then
by (5) $\bar{B}$ corresponds to the co-sets $\sum b_i A$. The smallest group containing
these co-sets is obviously $\bar{A} = A \cup \bar{B}$, but the index relation (7) will not hold
except when $\bar{B}$ is permutable with A. Now to $\bar{A}$ corresponds conversely as
in (8) $\bar{\bar{B}} = B \cap (A \cup \bar{B})$, and hence $\bar{\bar{B}}$ is a group in B/D permutable with A
and containing $\bar{B}$. It is also the smallest such group because if $B_1 \supseteq \bar{B}$ is
another, then one finds

$$B_1 = B \cap (A \cup B_1) \supseteq B \cap (A \cup \bar{B}) = \bar{\bar{B}}.$$


In this manner the groups in B/D are distributed into systems, each subgroup 
corresponding to the least subgroup which contains it and is permutable with A. 
Let us finally observe that if $B_1$ and $B_2$ are subgroups of B/D permutable
with A, then $B_1 \cup B_2$ and $B_1 \cap B_2$ are permutable with it. In the case of the
union it is obvious and for the cross-cut it follows from the representation

$$B_1 \cap B_2 = B \cap (A \cup B_1) \cap (A \cup B_2) = B \cap (A \cup (B_1 \cap (A \cup B_2))) = B \cap (A \cup (B_1 \cap B_2)).$$





We can summarize these results as follows: 


THEOREM 2. When A and B are permutable groups, then there exists a strong
structure isomorphism between the quotient structure $A \cup B/A$ and that
substructure of $B/A \cap B$ which consists of those subgroups which are permutable
with A.⁴


2. Permutable maximal groups. We shall prove here a few facts about the 
conjugate groups of permutable groups and show first 


THEOREM 3. Let A and B be permutable. Any conjugate of A in $M = A \cup B$
is then permutable with any conjugate of B in the same group.


Proof. It is sufficient to prove that any conjugate of A is permutable with B.
Now any such conjugate has the form $bAb^{-1}$ and the permutability of this
group with B follows by transforming the relation $a_1 b_1 = b_2 a_2$ by b.

Another property is the following: 


THEOREM 4. Two permutable groups A and B cannot be conjugate in their
union.


⁴ A similar investigation has been made for structure of equivalence relations by Paul
Dubreil and Mme. Dubreil-Jacotin, Propriétés algébriques des relations d'équivalence,
Comptes Rendus, vol. 205 (1937), pp. 704-706; Propriétés algébriques des relations d'équivalence;
théorèmes de Schreier et de Jordan-Hölder, Comptes Rendus, vol. 205, pp. 1349-1351.




Proof. Let us suppose that there exists an element $m = a_0 b_0$ in M such
that $(a_0 b_0) B (a_0 b_0)^{-1} = A$. Then one finds

$$B = b_0 B b_0^{-1} = a_0^{-1} A a_0 = A.$$
This theorem has the following interesting application: 


THEOREM 5. If all maximal subgroups of a group G are permutable, then all
maximal subgroups are normal. 


We shall show later that this is a characteristic property of the nilpotent 


groups. 
Theorem 4 can be extended as follows: 


THEOREM 6. Let A and B be permutable groups. If a subgroup $B_1$ of B is
conjugate to a subgroup $A_1$ of A in $A \cup B$, then both are conjugate to a subgroup
of $A \cap B$.


Proof. Since one has $(a_0 b_0) B_1 (a_0 b_0)^{-1} = A_1$, it follows that
$b_0 B_1 b_0^{-1} = a_0^{-1} A_1 a_0 \subset A \cap B$.


3. Existence of permutable decompositions. We shall say that a group G is 
permutably decomposed if G = A U B, where A and B are permutable, and in 
this case we say that A and B are permutably contained in G. We shall now 
study some conditions for one group to be permutably contained in another.⁵
From the preceding it is obvious that if A is permutably contained in G, then 
all its conjugates have the same property. 

Now for the moment let A and B be arbitrary (not necessarily permutable)
groups, and let $i_A$, $i_B$ denote the indices of A and B in $M = A \cup B$. Furthermore,
let $n_M$, $n_D$, $n_A$, $n_B$ denote the orders of the corresponding groups. From
the relations

$$n_M = i_A \cdot \frac{n_A}{n_D} \cdot n_D = i_B \cdot \frac{n_B}{n_D} \cdot n_D$$

it follows that $n_D$ divides $n_M/i_A$ and $n_M/i_B$, and hence it divides $n_M/[i_A, i_B]$,
where the bracket denotes the least common multiple.
On the other hand it follows from the relation (4) that

$$n_D \ge \frac{n_M}{i_A \cdot i_B}$$

and hence we can state


THEOREM 7. Let A and B have the indices $i_A$ and $i_B$ in $A \cup B$ and let $n_M$ and
$n_D$ be the orders of $A \cup B$ and $A \cap B$. Then


⁵ Investigations on this problem have been made particularly by E. Maillet: Sur les
groupes échangeables et les groupes décomposables, Bulletin Soc. Math. de France, vol. 28
(1900), pp. 7-16.
















(9)   $\frac{n_M}{[i_A, i_B]} \ge n_D \ge \frac{n_M}{i_A \cdot i_B}$

and $n_D$ divides the upper bound.

The lower bound for $n_D$ is attained if and only if A and B are permutable.
Hence we can state

THEOREM 8. Let A and B be subgroups with relatively prime indices in a
group G. Then A and B are permutable and $G = A \cup B$.
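Theorem 8 can be checked on the symmetric group on four letters. The sketch below is illustrative only and makes its own assumptions: permutations are tuples, and `compose`, `generate` and `product_set` are hypothetical helpers. A is the subgroup of order 6 fixing one letter (index 4) and B is a Sylow 2-subgroup of order 8 (index 3).

```python
from math import gcd


def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens, n):
    """Subgroup of the symmetric group on n points generated by gens."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def product_set(X, Y):
    """All products x*y with x in X and y in Y."""
    return {compose(x, y) for x in X for y in Y}


G = generate([(1, 0, 2, 3), (1, 2, 3, 0)], 4)   # the whole symmetric group
A = generate([(1, 0, 2, 3), (1, 2, 0, 3)], 4)   # permutations fixing the point 3
B = generate([(1, 2, 3, 0), (2, 1, 0, 3)], 4)   # dihedral Sylow 2-subgroup

index_A = len(G) // len(A)
index_B = len(G) // len(B)
print(index_A, index_B, gcd(index_A, index_B))   # 4 3 1

# With relatively prime indices the two product sets coincide and fill G.
print(product_set(A, B) == product_set(B, A) == set(G))
```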

From this theorem follows immediately 

THEOREM 9. Let A be a group of prime power index $p^\alpha$. Then A is permutable
with every Sylow group $S_p$ corresponding to p and $G = S_p \cup A$.

This theorem shows that any subgroup of prime power index $p^\alpha$ is permutably
contained in G except possibly when G is a p-group.

Theorem 9 may be extended as follows: 

Let G have a subgroup A of index $n_1$ and let $N = n \cdot m$ be a factorization of
the group order into two relatively prime factors, where $n_1$ divides n. If G
has a subgroup B of order n, then A is permutably contained in G and
$G = A \cup B$.

Let us finally mention a fact which will be useful later. 

THEOREM 10. Let A be a subgroup of prime index p in G. Then there exists
a cyclic group $\{b\} = B$ such that it is permutable with A and $G = A \cup B$, $b^p \subset A$.

Proof. Let $S_p$ be a Sylow group of G corresponding to p. The cross-cut
$S_p \cap A$ must then have the index p in $S_p$ since A and $S_p$ are permutable. This
implies also that $S_p \cap A$ is normal in $S_p$ and any element b in $S_p$ but not in
$S_p \cap A$ will satisfy the conditions.

4. Normal subgroups in permutable groups. An important problem in 
connection with permutable decompositions is the question when the existence 
of a permutable decomposition of a group implies the existence of a normal 
subgroup. 

THEOREM 11. Let A and B be permutable. Those elements $a_0$ in A for which
$b a_0 b^{-1}$ belongs to A for every b in B form a subgroup $A_0$ which is normal in $A \cup B$.⁶


Proof. It is obvious that the elements $a_0$ form a group $A_0$ and that $A_0$ is
invariant by transformation with elements of B. To show that $A_0$ is normal
in A, let a be any element of A and let us write $\bar{a}_0 = a a_0 a^{-1}$. Then for any b

$$b \bar{a}_0 b^{-1} = b a b^{-1} \cdot b a_0 b^{-1} \cdot (b a b^{-1})^{-1} = b a b^{-1} \cdot a_0' \cdot (b a b^{-1})^{-1}.$$

Now let $b a b^{-1} = a_2 b_2$. Then one finds

$$b \bar{a}_0 b^{-1} = (a_2 b_2) a_0' (a_2 b_2)^{-1} = a_2 \cdot a_0'' \cdot a_2^{-1} \subset A$$

and $\bar{a}_0$ belongs to $A_0$.


The following theorem has very useful applications. 


⁶ Ore II, Chapter 3.







THEOREM 12. Let A and B be permutable and let $B_0$ be a normal subgroup of B
which is also contained in $A \cap B$. Then A contains a subgroup $A_0 \supseteq B_0$ which
is normal in $A \cup B$.


Proof. The conjugates $A_i$ of A in $A \cup B$ are all obtained when A is transformed
by the elements of B. Since A contains $B_0$ and $B_0$ is normal in B, all
conjugates $A_i$ also contain $B_0$. This shows that the normal subgroup $A_0$ of
$A \cup B$ which is defined by the cross-cut of all $A_i$ must also contain $B_0$.

An important special case is 


THEOREM 13. Let A and B be permutable and B Abelian. If $A \cap B$ is not
the unit group, then A contains a normal subgroup of $A \cup B$.


5. Application of a theorem of Frobenius. We shall now give another case 
in which the existence of a permutable decomposition implies the existence of a 
normal subgroup. The method of proof depends on the theorem of Frobenius 
that if k divides the group order, then the number of solutions of the equation
$x^k = e$ is a multiple of k. The method is analogous to a method used by Burnside⁷
in the determination of all groups with cyclic Sylow subgroups.


THEOREM 14. Let A and B be permutable groups such that $A = \{a\}$ is cyclic
of prime power order $p^\alpha$ and $n_B$ is relatively prime to $p(p - 1)$. Then B is normal
in $M = A \cup B$.

Proof. The number of solutions of the equation

(10)   $x^{p^{\alpha-1} n_B} = e$

must be of the form $r \cdot p^{\alpha-1} \cdot n_B$, where $r < p$. The number of elements not
satisfying (10) is then $(p - r) p^{\alpha-1} \cdot n_B$, and we shall prove that $r = 1$ by showing
that this number must be divisible by $p - 1$.

There are $p^{\alpha-1}(p - 1)$ powers of a not satisfying (10). Similarly, each conjugate
of a gives rise to $p^{\alpha-1}(p - 1)$ different elements of order $p^\alpha$. Now any
element in M can be written uniquely in the form $m = \bar{a} \cdot \bar{b}$, where $\bar{a}$ and $\bar{b}$ are
permutable elements (powers of m) and where the order of $\bar{a}$ is a power of p and the
order of $\bar{b}$ is a divisor of $n_B$. In order that such an element shall not satisfy (10)
it is necessary and sufficient that the order of $\bar{a}$ be $p^\alpha$, and hence it is contained
as a generating element in a conjugate of $\{a\}$. Now any element $\bar{b}$ permutable
with $\bar{a}$ is also permutable with all powers of $\bar{a}$, and hence the elements not
satisfying (10) fall into disjoint classes each containing a number of elements divisible
by $p - 1$.

By induction one proves that the number of solutions of $x^{p^{i} n_B} = e$ is equal
to $p^{i} \cdot n_B$. For $i = 0$ it follows that $x^{n_B} = e$ has $n_B$ solutions, and this implies
obviously that B is normal.




6. Quasi-normality. On the basis of permutability conditions various types 
of normality may be defined. The property of being permutably contained 


⁷ W. Burnside, Theory of Groups of Finite Order, 2d ed., pp. 163-164.













may be considered as a weak normality property. The concept of quasi- 
normality is the strongest type of normality short of actual normality. We 
say that A is quasi-normal in G if A is permutable with every subgroup of G.⁸
This means that for any g in G and a in A we have

(11)   $g \cdot a = a' \cdot g^{n}$.

If A is quasi-normal in G, then any conjugate of A by any automorphism of G 


is also quasi-normal. 

Let

(12)   $G = \sum g_i A$

be a co-set expansion of G with respect to A. If A is quasi-normal in G, then
all $g_i$ must satisfy permutability relations of the form (11). It does not follow
conversely, however, that A is quasi-normal when these hold. The property
of having such a permutable representative system $g_i$ must therefore be
considered as a weaker normality condition. It follows from Theorem 10 that
every group of prime index has a permutable representative system.

By combining Theorems 13 and 10 we obtain 

THEOREM 15. Let A be a group of prime index p in G. Then G is permutably
decomposable $G = A \cup B$, where $B = \{b\}$ is cyclic of prime power order $p^\alpha$.
Here either $\alpha = 1$ and $A \cap B = E$ or A contains a normal subgroup of G.

Let us finally deduce some properties of quasi-normal subgroups. We prove 
first 


THEOREM 16. Any maximal quasi-normal subgroup is normal. 

Proof. The union of two quasi-normal subgroups is again quasi-normal, and 
hence the union of a maximal quasi-normal subgroup A and one of its conju- 
gates is equal to G. This is, however, not possible according to Theorem 4, so 
A must be normal. A consequence of Theorem 16 is 

THEOREM 17. A quasi-normal subgroup of prime index is normal. 


One may define a group G to be quasi-solvable when there exists a chain
$G \supset A \supset B \supset \cdots \supset E$, where each group is maximal and quasi-normal in the
preceding. From Theorem 16 it follows that the chain must be a composition
chain and each index a prime.


THEOREM 18. A quasi-solvable group is solvable. 
From Theorem 16 one also concludes 


THEOREM 19. Any quasi-composition chain in a group is a composition chain 
and any quasi-normal group occurs in some composition chain. 


⁸ Ore I, Chapter 2.













It follows also that a simple group cannot contain any quasi-normal subgroups.
A theorem related to the preceding is the following theorem of
Frobenius:⁹


THEOREM 20. Let p be the smallest prime dividing the order of a group G. 
Then a subgroup of index p is normal. 


Proof. It is sufficient to show that such a subgroup A is quasi-normal.
According to Theorem 15 there exists an element b permutable with A and such
that $b^p$ is contained in A. Any element $b_1$ not in A must have the order p
with respect to A and one must have relations $b_1^k = a_k \cdot b^l$ $(k = 1, 2, \cdots, p - 1)$,
where also l must run through the same numbers in some order. For any
element a in A one therefore finds $b_1 \cdot a = a_1 \cdot b^{l} \cdot a = a_2 b^{l'} = a_3 b_1^{k'}$, and A is
quasi-normal.


Chapter II. Normal decompositions 


1. Properties of normal decompositions. We shall begin by recalling a few 
of the properties of normal union decompositions.¹⁰ Any finite group G has
normal decompositions


(1)   $G = A_1 \cup A_2 \cup \cdots \cup A_r$,


where the $A_i$ are normally indecomposable in G; i.e., there exists no representation
$A_i = B \cup C$, where B and C are normal in G and proper subgroups of
$A_i$. The condition for $A_i$ to be normally indecomposable in G is that it contain
a unique maximal group $M_i$ normal in G. For those $A_i$ which occur in a
decomposition (1) one finds that the quotient group


(2)   $P_i = A_i/M_i$


is a simple group, and we shall say that $A_i$ belongs to this simple group $P_i$.
It is convenient to distinguish between two types of indecomposable groups $A_i$.
We say that $A_i$ is of Abelian type if $P_i$ is a cyclic group of prime order and $A_i$ is
of non-Abelian type when $P_i$ is non-commutative.

One can always suppose that a decomposition (1) is reduced; i.e., it contains
no superfluous terms and no $A_i$ can be replaced by a smaller group also normal
in G. Then the main theorem on irreducible decompositions (1) states that
any other such decomposition $G = B_1 \cup B_2 \cup \cdots \cup B_r$ contains the same
number of terms and any one group in one decomposition may be replaced by a
suitably chosen group in the other.

Let us introduce the following notations. We write
$\bar{A}_i = A_1 \cup \cdots \cup A_{i-1} \cup A_{i+1} \cup \cdots \cup A_r$
and one finds that the groups $C_i = M_i \cup \bar{A}_i$ are maximal
normal subgroups of G and $G_i = G/C_i \cong P_i$. If C is the cross-cut of all maximal


⁹ G. Frobenius, Über endliche Gruppen, Sitzungsber. Akad. Berlin, 1895 (1), pp. 163-194.
¹⁰ Ore II, Chapter 2.










normal subgroups of G, then $C = C_1 \cap \cdots \cap C_r = M_1 \cup \cdots \cup M_r$. The
quotient group

(3)   $L = G/C$

has been called the upper normal cover quotient of G. It is completely reducible;
i.e., it is the direct product of simple groups $L = R_1 \cup \cdots \cup R_r$, where
$R_i = N_i/C$, $N_i = M_1 \cup \cdots \cup M_{i-1} \cup A_i \cup M_{i+1} \cup \cdots \cup M_r$, and one has
the isomorphisms $R_i \cong G_i \cong P_i$. This shows that except for isomorphisms
the simple quotient groups $P_i$ to which the components $A_i$ belong are uniquely
determined.





2. Groups whose maximal subgroups are normal. We shall now study the
groups in which all maximal groups are permutable. We have already seen
(Theorem 5, Chapter I) that this implies that all maximal groups are normal.
In this case all the maximal groups must have quotient groups which are cyclic
of prime order. This implies in turn that all the indecomposable $A_i$ in a normal
decomposition (1) must be of Abelian type, and hence the upper normal cover
quotient (3) is Abelian and the direct product of cyclic groups of prime order.
We must also have $C = \phi$, where $\phi$ is the $\phi$-group of G, since C was the cross-cut
of all normal maximal groups while $\phi$ is the cross-cut of all maximal groups
in G. The solution of our problem then follows from a theorem of Wielandt:¹¹
The necessary and sufficient condition for a group to be nilpotent, i.e., the direct
product of its Sylow groups, is that the $\phi$-group contain the commutator group.
This leads immediately to the following characterization of a nilpotent group:

THEOREM 1. The necessary and sufficient condition for a group to be nilpotent
is that all maximal subgroups be permutable, hence that they all should be normal.

Proof. It is easily seen that every nilpotent group has this property since
it holds for the Sylow groups. The converse follows from the fact that $G/C = G/\phi$
is Abelian, and hence $\phi$ contains the commutator group.

3. Cross-cut decompositions. To the theory of decomposition given in §1
there corresponds a dual theory which one obtains by considering the quotient
groups G/A for normal subgroups A and introducing the operations

$$G/A \cup G/B = G/A \cap B, \qquad G/A \cap G/B = G/A \cup B.$$

The dual of the decomposition (1) becomes the representation

(4)   $A_1 \cap A_2 \cap \cdots \cap A_r = E$

of the unit group as the cross-cut of normal subgroups, where the $A_i$ are normally
cross-cut indecomposable in G; i.e., they cannot be represented as the
cross-cut of larger normal subgroups of G. The necessary and sufficient condition
for $A_i$ to have this property is that among the various normal subgroups
of G containing $A_i$ as a proper subgroup there exists a unique minimal one $M_i$.
We say again that $A_i$ belongs to the quotient group $M_i/A_i = P_i$. When the
$A_i$ in (4) are indecomposable in the cross-cut sense, none of them can be replaced
by any larger normal group.


¹¹ H. Wielandt, Eine Kennzeichnung der direkten Producte von p-Gruppen, Math. Zeit.,
vol. 41 (1936), pp. 281-282.

Now let D denote the (lower) normal cover group of G, i.e., the union of all
minimal normal subgroups of G. Since the decomposition (4) is supposed not
to contain any superfluous components, $A_i$ cannot contain D, because D has a
normal subgroup in common with any normal subgroup of G. But since there
exists a unique minimal $M_i$ containing $A_i$, one must have $A_i \cup P_i = M_i$ for
every minimal normal subgroup $P_i$ of G not contained in $A_i$. This shows that
the $P_i$ to which $A_i$ belongs is isomorphic to a minimal normal subgroup of G.
One also sees that $A_i$ contains all but one minimal group in a suitable representation
of D as the direct product of minimal normal subgroups of G, and
the $A_i$ can be defined as maximal groups with this property.

It follows by duality that any two representations (4) have the same number
of components, the minimal groups corresponding to these components are
isomorphic in some order, and one component in one representation can always
replace a suitably chosen component in the other. There are also certain cases
in which the $A_i$ must be unique, analogously to the dual cases.¹²


4. Representations by permutations. The preceding theory is closely con- 
nected with the problem of representing G as a permutation group. Let us 
recall that any transitive homomorphic representation of G may be obtained 
by expanding G in co-sets with respect to a subgroup A, 


(5)   $G = \sum A g_i$,


and when the co-sets are multiplied on the right by the elements of G, one obtains
a set of permutations of the co-sets forming a group $P_A$ homomorphic to
G and isomorphic to G/N, where N is the greatest normal subgroup of G contained
in A. We shall say that the representation $P_A$ is induced by A, and it is
convenient to say that A and also $P_A$ belong to N. The degree of the transitive
representation $P_A$ is k, where k is the index of A in G.
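The construction of the representation induced by A can be sketched in code. This is an illustration under assumptions rather than anything taken from the paper: permutations are tuples, and `compose`, `generate` and `coset_representation` are hypothetical helper names. The example takes G to be the symmetric group on four letters and A a dihedral subgroup of order 8, so that the degree is 3.

```python
def compose(p, q):
    """Product p*q of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(gens, n):
    """Subgroup of the symmetric group on n points generated by gens."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def coset_representation(G, A):
    """Permutations of the co-sets A*g induced by multiplication on the right."""
    cosets, covered = [], set()
    for g in sorted(G):
        if g not in covered:
            coset = frozenset(compose(a, g) for a in A)
            cosets.append(coset)
            covered |= coset
    position = {coset: i for i, coset in enumerate(cosets)}

    def image(x):
        # The co-set A*g is carried to the co-set A*g*x.
        return tuple(position[frozenset(compose(y, x) for y in coset)]
                     for coset in cosets)

    return len(cosets), {x: image(x) for x in G}


G = generate([(1, 0, 2, 3), (1, 2, 3, 0)], 4)   # symmetric group on 4 letters
A = generate([(1, 2, 3, 0), (2, 1, 0, 3)], 4)   # dihedral subgroup of order 8

degree, rep = coset_representation(G, A)
# N: the elements inducing the unit permutation of the co-sets.
N = {x for x, perm in rep.items() if perm == tuple(range(degree))}
print(degree, len(N), len(set(rep.values())))
```

Here N comes out as the four-group, the greatest normal subgroup of G contained in A, and the induced permutation group has order 24/4 = 6.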

Any intransitive representation of G may be obtained by aligning or forming 
the sum of transitive representations. Any representation has a corresponding 
set of subgroups 
(6)   $A_1, A_2, \cdots, A_s$

inducing the transitive components and each of them belongs to a normal subgroup
of G

(7)   $N_1, N_2, \cdots, N_s$

and the degree of the representation is $k_1 + \cdots + k_s$, where $k_i$ is the index
of $A_i$ in G.


¹² This short exposition of the properties of cross-cut decompositions is mainly a
restatement from Ore II, Chapter 2, §6. It has been repeated here because the presentation
at one point was not quite clear.


THEOREM 2. The necessary and sufficient condition for a permutation representation
of G induced by the subgroups $A_i$ in (6) to be a true representation is that
the normal subgroups $N_i$ in (7) to which the $A_i$ belong satisfy the relation


(8)   $N_1 \cap \cdots \cap N_s = E$.


Proof. An element corresponding to the unit permutation must belong to
all $N_i$, and conversely.

All transitive representations belonging to the same normal subgroup N
shall be said to belong to the same class $C_N$. In each class there is contained
a regular permutation representation induced by N itself. Any group A containing
N corresponds to a set of systems of imprimitivity in this regular representation,
and any normal group $N_1$ is characterized by the property that its
elements transform the systems of imprimitivity into themselves. The quotient
group $G/N_1$ is then isomorphic to the permutations of the systems.

The classes of permutation representation can be made into a structure by 


the definitions 
(9)   $C_N \cup C_M = C_{N \cup M}, \qquad C_N \cap C_M = C_{N \cap M}$.


We shall say that a class $C_N$ is indecomposable if there exist no greater normal
subgroups L and M such that $N = M \cap L$. If N is decomposable, the
group G/N can be represented isomorphically as the sum of a representation
of G/M and G/L.

We shall now discuss the true or isomorphic representations of G as a per- 
mutation group. We shall usually suppose that the representation is reduced; 
i.e., no $N_i$ in (8) shall contain the cross-cut of any set of the others. When
such $N_i$ are omitted, the relation (8) still holds. In terms of permutations
this reduction means that one omits components homomorphic to a sum of the 
others. 

From (9) follows that the structure of representative classes is isomorphic 
to the structure of all normal subgroups of G. When this is combined with 
Theorem 2, the decomposition theory indicated in §3 gives 


THEOREM 3. Any true reduced representation of a group as a permutation group 
is the sum of transitive indecomposable components. Any two such representations 
contain the same number of transitive components and one component in one repre- 
sentation may be exchanged with a suitably chosen component in the other. 


All the indecomposable components may be obtained by constructing the
representation classes corresponding to the various subgroups $A_i$ occurring in a
reduced decomposition (4). The number of transitive indecomposable components
in any permutation representation is equal to the number r of independent
minimal normal groups of G in the representation of the cover D as the
direct product

$$D = P_1 \cup P_2 \cup \cdots \cup P_r.$$


For many problems it is of importance to determine the true representations 
of smallest degree. Such a representation must be reduced. Furthermore, 
in each class $C_N$ one must select the representation of smallest degree, i.e., the
representation induced by a subgroup A belonging to N and having the smallest 
index in G. The effect of decomposing a class is usually to obtain smaller 
representations, and it is to be conjectured that a sum of indecomposable repre- 
sentations gives the absolute smallest degree. 

For Abelian groups the problem is readily solved. Here there is only one
representation, namely, the regular one, in each class. If a class is decomposed
$N = M \cap L$ with the indices $k_N$, $k_M$, and $k_L$, one obtains a shorter representation
of degree $k_M + k_L < k_N$. This shows that the shortest representation
must be a sum of cyclic transitive parts and one finds without difficulty


THEOREM 4. The degree of the smallest true representation of an Abelian group
as a permutation group is equal to the sum of its invariants.¹³
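Theorem 4 turns into a short computation once the invariants are known. The sketch below assumes that "invariants" means the prime power orders of the cyclic factors in the decomposition of the Abelian group into cyclic groups of prime power order; the function names `prime_power_parts` and `smallest_degree` are illustrative and not from the paper. The group is described by the orders of any cyclic direct factors.

```python
def prime_power_parts(m):
    """Prime powers p^a that exactly divide m, e.g. 12 -> [4, 3]."""
    parts = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:
                m //= p
                q *= p
            parts.append(q)
        p += 1
    if m > 1:
        parts.append(m)
    return parts


def smallest_degree(cyclic_orders):
    """Sum of the invariants of a direct product of cyclic groups of the given orders."""
    return sum(q for m in cyclic_orders for q in prime_power_parts(m))


print(smallest_degree([12]))     # cyclic of order 12: 4 + 3 = 7
print(smallest_degree([2, 2]))   # four-group: 2 + 2 = 4
print(smallest_degree([6, 4]))   # 2 + 3 + 4 = 9
```

The cyclic group of order 12, for instance, is generated by the product of a 4-cycle and a disjoint 3-cycle on 7 letters.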


5. Applications. If a group G contains a subgroup H, then any true representation
of G contains a true representation of H. The transitive representations
of G are induced by subgroups A not containing any normal subgroup of G.
If $d_H$ is the degree of a smallest representation of H, then it follows that the
index of any such A must be at least $d_H$. From Theorem 4 follows that if G
contains an Abelian group H, then the index of any subgroup A of G not containing
normal subgroups is at least equal to the sum of the invariants of H.
As a special case we have 


THEOREM 5. Let G be a simple group and H a maximal Abelian subgroup of G
with the invariants $p_i^{\alpha_i}$. Then the index of any maximal group in G is at least
$\sum p_i^{\alpha_i}$.


If one could give some lower bound for the degree of a true permutation 
representation of a p-group, this would also give a lower bound for the indices 
of subgroups in a simple group from the group order. 

Now let A and B be permutable groups and let 


(10)   $B = \sum b_i D$


be the co-set expansion of B with respect to their cross-cut. We form all
products


(11)   $a \cdot b_i = \bar{b}_i \cdot a_i$


for a fixed element a in A and all generators $b_i$ in (10). The $\bar{b}_i$ also denote
some of these generators. No two of them can be equal since $\bar{b}_i = \bar{b}_j$ implies
$a b_i a_i^{-1} = a b_j a_j^{-1}$ or $b_j^{-1} \cdot b_i \subset A$ and $b_i$ and $b_j$ would belong to the same co-set
in (10).


¹³ A. Powsner, Über eine Substitutionsgruppe kleinsten Grades die einer gegebenen Abelschen
Gruppe isomorph ist. After the completion of this paper a review of this paper
appeared (Zentralblatt f. Math., vol. 19 (1939), pp. 155-156) stating that it contained a
theorem equivalent to Theorem 4.

Each element a in A corresponds therefore to a permutation of the generators
$b_i$ defined by the relation (11). This permutation can be assumed to be a
substitution on at most $\frac{n_B}{n_D} - 1$ letters since one can always choose $b_1 = e$.

The group A is homomorphic to the substitution group thus defined. Those
elements a in A which correspond to the unit substitution form a subgroup $A_0$
for which $b A_0 b^{-1} \subset A$ for every b in B. According to Theorem 11, Chapter I,
this implies that $A_0$ is normal in $A \cup B$.

Such a normal subgroup $A_0 \supset E$ must exist if A or a subgroup of A cannot
be represented isomorphically by a permutation group on $\frac{n_B}{n_D} - 1$ letters. Let
us mention only one case.

THEOREM 6. Let A and B be permutable groups. If A contains any element
of prime power order $p^\alpha$ such that $p^\alpha \ge \frac{n_B}{n_D}$, then A must contain a normal
subgroup of $A \cup B$.


6. The $\phi$-group. The $\phi$-group of a group G is a characteristic subgroup
defined as the cross-cut of all maximal subgroups of G. It may also be defined
as the set of elements which can be omitted in any generating system for G.
It is also the union of all those subgroups which are superfluous in any representation
of G as the union of subgroups. A fundamental property of the
$\phi$-group is the basis theorem¹⁴ expressing that $\phi$ is the maximal normal subgroup
such that if $g_1, \cdots, g_s$ is any generating system of the co-sets in $G/\phi$, then they
also generate G. A special case is Burnside's basis theorem for p-groups.
From the basis theorem follows that the $\phi$-group of the quotient group $G/\phi$ is
the unit group, so that one cannot form repeated ascending $\phi$-groups.
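The definition as a cross-cut of maximal subgroups is easy to follow in the simplest case. The sketch below is an illustration only, restricted to the cyclic group of order n written additively as the residues modulo n; it uses the known fact that the maximal subgroups of this group are the subgroups of multiples of p, one for each prime p dividing n. The names `multiples` and `phi_group_cyclic` are hypothetical.

```python
def multiples(n, d):
    """Subgroup of the residues modulo n consisting of the multiples of d (d divides n)."""
    return frozenset(range(0, n, d))


def phi_group_cyclic(n):
    """Cross-cut of all maximal subgroups of the cyclic group of order n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    phi = frozenset(range(n))
    for p in primes:
        phi &= multiples(n, p)   # one maximal subgroup for each prime divisor
    return phi


print(sorted(phi_group_cyclic(12)))  # [0, 6]
print(sorted(phi_group_cyclic(8)))   # [0, 2, 4, 6]
print(sorted(phi_group_cyclic(6)))   # [0]
```

For order 12 the result is the subgroup of order 2, and the residue 6 can indeed be dropped from any generating system: a set of residues generates the group modulo 12 exactly when it generates it modulo 6.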

If A is a normal subgroup of G, then


(12)   $\phi_A \subseteq A \cap \phi_G$.


This follows from the fact that any generating system for $G/\phi_G$ must also
generate $A/\phi_A$. From (12) one concludes $\phi_{A \cap B} \subseteq \phi_A \cap \phi_B$. Similarly, one
obtains $\phi_{A \cup B} \supseteq \phi_A \cup \phi_B$. If A and B are relatively prime, one finds that the
equality holds in the last relation.

The $\phi$-group is nilpotent.¹⁵ This fact may be considered as a special case
of the following


¹⁴ H. Zassenhaus, Lehrbuch der Gruppentheorie, vol. 1, p. 45.
¹⁵ See, for example, Miller, Blichfeldt, and Dickson, Theory and Application of Finite
Groups, pp. 71-72.



















THEOREM 7. Let A be a normal subgroup of G and $N_p$ the normalizer in G of
some Sylow group $S_p$ of A. Then


(13)   $G = N_p \cup A$.


Proof. All the conjugates of $S_p$ belong to A and therefore have the form
$a S_p a^{-1}$. This means that for any g in G there exists an a such that
$g S_p g^{-1} = a S_p a^{-1}$, or $g = a \cdot n_p$, where $n_p$ belongs to $N_p$.

When A is the $\phi$-group, (13) reduces to $G = N_p$ and $S_p$ is normal.

The $\phi$-group may be determined by means of this property. We observe
first that any group has a maximal normal nilpotent subgroup N. This group
may be obtained as follows: The union of all normal p-groups of G for any
prime p is a maximal normal p-group $N_p$ and the group N is given by
$N = N_p \cup N_q \cup \cdots$ for the various primes p, q, ··· dividing the group order.

Since $\phi$ is nilpotent, we can conclude by means of (12) that


(14) $G \supset N \supset \phi \supset \phi_N$.


It remains therefore to determine the location of $\phi$ between N and $\phi_N$. In a
nilpotent group the quotient group $N/\phi_N$ is Abelian and the union of groups of
the type (p, p, ...) for the various primes p, q, .... In the following it is
convenient to say that the group G splits over the normal subgroup A if there exists
a subgroup M such that


(15) $G = M \cup A$, $M \cap A = B$,


where B also is normal in G.

Now let us consider the quotient group $G/\phi_N$. From the definition of the
$\phi$-group it follows that $G/\phi_N$ cannot split over $\phi/\phi_N$. But for any normal
subgroup A not contained in $\phi$ such that $N \supset A \supseteq \phi_N$ the groups G and $G/\phi_N$
must split since for any maximal group M of G not containing A we must have
the relations (15). Here B is normal in G since it is normal in M and also normal
in A since the quotient group $A/\phi_N$ is Abelian. We can summarize these
remarks as follows:


THEOREM 8. Let G be a finite group, N its maximal normal nilpotent group
and $\phi$ and $\phi_N$ the $\phi$-groups of G and N. Then $N \supseteq \phi \supset \phi_N$ and $\phi/\phi_N$ is the
maximal normal subgroup of $G/\phi_N$ contained in the Abelian group $N/\phi_N$ over
which G does not split.

This result may also be expressed as follows: 

THEOREM 9. Let G be a finite group. The necessary and sufficient condition
for G not to have a $\phi$-group is that the maximal normal nilpotent subgroup of G
be the unit group or the union of Abelian groups of type (p, p, ...) and in the
last case G must split over all its minimal normal Abelian subgroups.


Chapter III. Properties of chains 


1. Refinement of chains. A set of subgroups

(1) $G \supset A_1 \supset \cdots \supset A_{r-1} \supset E$











446 OYSTEIN ORE 


shall be called a chain and the indices $[A_{i-1} : A_i]$ are the indices of the chain.
Two chains are said to have the same index type or to be conformal when their
indices are the same in some order. When each group in (1) is maximal in the
preceding, the chain is said to be complete and the groups $A_{i-1} \supset A_i$ are said
to be consecutive.

As usual we call (1) a composition chain when $A_i$ is a maximal normal
subgroup of $A_{i-1}$ and a principal chain when $A_i$ is maximal among the normal
subgroups of G contained in $A_{i-1}$. Similarly, by introducing the concept of
quasi-normality one can define quasi-principal chains while a quasi-composition
chain is a composition chain according to Theorem 16, Chapter I.

In the following we shall compare two chains


(2) $G = A_0 \supset A_1 \supset \cdots \supset A_{r-1} \supset A_r = E$,
    $G = B_0 \supset B_1 \supset \cdots \supset B_{s-1} \supset B_s = E$.


This will be done by means of a weak form of the theorem of Jordan-Hölder
which we have previously derived.¹⁶ Let us recall first a few of the most
important facts.


The two chains (2) are said to be cross-cut permutable when $A_i$ (i = 0, 1, ..., r)
is permutable with all groups $A_{i-1} \cap B_j$ (j = 0, 1, ..., s) and $B_j$ (j = 0, 1,
..., s) is permutable with all $B_{j-1} \cap A_i$ (i = 0, 1, ..., r). The main theorem
on such chains is:

Any two cross-cut permutable chains may be refined into two new cross-cut
permutable chains which are conformal.


The refined chains may be given explicitly. They consist of the terms

(3) $A_{i,j} = A_i \cup (A_{i-1} \cap B_j)$, $B_{j,i} = B_j \cup (B_{j-1} \cap A_i)$


and one has $[A_{i,j-1} : A_{i,j}] = [B_{j,i-1} : B_{j,i}]$.

To this theory there exists a dual: We say that two chains (2) are union
permutable when $A_{i-1}$ is permutable with $A_i \cup B_j$ (j = 0, 1, ..., s) and $B_{j-1}$
is permutable with $A_i \cup B_j$ (i = 0, 1, ..., r). For such chains one has the
same refinement theorem as above. In this case the refinements corresponding
to (3) are

(4) $A_{i,j} = A_{i-1} \cap (A_i \cup B_j)$, $B_{j,i} = B_{j-1} \cap (B_j \cup A_i)$.


2. Comparison with principal chains. We shall now apply the preceding
results to the case where the second chain in (2) is a principal or quasi-principal
chain. Then obviously $B_j$ is permutable with $B_{j-1} \cap A_i$ and $B_{j-1}$ is permutable
with $B_j \cup A_i$. Furthermore, for any elements $a_i$ in $A_i$ and $b_j$ in $B_j$ we have
$a_i \cdot b_j = b_j' \cdot a_i'$, and if $b_j$ belongs to $A_{i-1} \cap B_j$, then $b_j'$ must have the same
property. This shows that $A_i$ and $A_{i-1} \cap B_j$ are permutable and similarly
one sees that $A_{i-1}$ is permutable with $A_i \cup B_j$. This proves


16 Ore I, Chapter 3. 




























THEOREM 1. Any chain is both cross-cut and union permutable with a principal 
or quasi-principal chain. 


In this case the preceding theory is somewhat simplified since it follows from 
the Dedekind relation that the refinements (3) and (4) are the same. 
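The appeal to the Dedekind relation can be spelled out as follows. This is a sketch under the assumption that $B_j$ is normal or quasi-normal in G, so that the union $A_i \cup B_j$ coincides with the product $A_i B_j$; the letters X, Y, Z are introduced here only for the statement of the relation.

```latex
% Dedekind relation: for subgroups X \subseteq Z and a subgroup Y permutable with X,
%   X \cup (Y \cap Z) = (X \cup Y) \cap Z .
% Take X = A_i, Y = B_j, Z = A_{i-1}, so that X \subseteq Z:
\[
  A_i \cup (A_{i-1} \cap B_j) \;=\; A_{i-1} \cap (A_i \cup B_j).
\]
% Take X = B_j, Y = A_i, Z = B_{j-1}, so that X \subseteq Z:
\[
  B_j \cup (B_{j-1} \cap A_i) \;=\; B_{j-1} \cap (B_j \cup A_i).
\]
% The left-hand sides are the terms of (3), the right-hand sides those of (4).
```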

From now on we shall study the case where the first chain in (2) is a complete 
chain while the second is a principal or quasi-principal chain. Then the second 
chain can be refined into a chain which has the same index type as the first 
and we have 

THEOREM 2. To any complete chain there exists a conformal chain (not
necessarily complete) passing through any prescribed principal or quasi-principal chain.


We conclude from this theorem 


THEOREM 3. A complete chain is never shorter than a principal or quasi- 
principal chain. 


A further important consequence of Theorem 2 is 


THEOREM 4. Let $n_i$ (i = 1, 2, ..., s) denote the indices of any principal or
quasi-principal chain. It is then possible to enumerate the indices $f_j^{(i)}$ of any
complete chain in such a manner that $n_i = f_1^{(i)} f_2^{(i)} \cdots f_{k_i}^{(i)}$ (i = 1, 2, ..., s).
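As an illustration that is not taken from the paper, Theorem 4 can be traced in the symmetric group $S_4$. Here $V_4$ denotes the normal four-group, $D_4$ a Sylow 2-subgroup of order 8, and $S_3$ the stabilizer of a point.

```latex
% Principal chain of S_4 and its indices n_1, n_2, n_3:
\[ S_4 \supset A_4 \supset V_4 \supset E, \qquad n_1 = 2,\quad n_2 = 3,\quad n_3 = 4. \]
% A complete chain through D_4, with indices 3, 2, 2, 2:
\[ S_4 \supset D_4 \supset C_4 \supset C_2 \supset E, \qquad
   n_1 = 2,\quad n_2 = 3,\quad n_3 = 2 \cdot 2. \]
% A complete chain through S_3, with indices 4, 2, 3:
\[ S_4 \supset S_3 \supset C_3 \supset E, \qquad
   n_1 = 2,\quad n_2 = 3,\quad n_3 = 4. \]
```

In both cases the indices of the complete chain can be grouped so that each group multiplies to one index of the principal chain, and neither complete chain is shorter than the principal chain.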


Another point of interest about the refinement into a principal or
quasi-principal chain is the following: If a group $A_i$ in the first chain is normal in
$A_{i-1}$ and the second chain is a principal chain, then $B_{j,i}$ is found to be normal
in $B_{j,i-1}$. Hence when the first chain is complete and contains a certain number
of cases in which one term is normal in the preceding, then the refinement into
a principal chain must contain at least the same number of normalities.


3. Chains of maximal length. Now let us turn to the case where the first 
chain in (2) is a chain of maximal length in G while the second remains principal 
or quasi-principal. In this case the refinement of the first into the second must 
also be complete and we have therefore 


THEOREM 5. To any complete chain of maximal length there always exists a
conformal chain containing any prescribed principal or quasi-principal chain.


Theorem 5 shows that one can determine all index types of maximal chains
in a group from those of the quotient groups $B_{i-1}/B_i$ in any principal chain.
It is possible to reduce the problem still a little further. It is known that the
group $B_{i-1}/B_i$ is the direct product of simple isomorphic groups and hence there
exists a principal chain in the quotient group


(5) $B_{i-1} \supset B_{i,1} \supset \cdots \supset B_{i,k} \supset B_i$,


where all quotient groups are simple, and hence the $B_{i,j}$ form part of a
composition chain in G. When Theorem 5 is applied to the quotient group $B_{i-1}/B_i$,
one sees that to any maximal chain in it there exists another with the same index
type passing through all $B_{i,j}$. By means of the theorem of Jordan-Hölder we
obtain therefore

THEOREM 6. To any chain of maximal length $G \supset A_1 \supset \cdots \supset A_{r-1} \supset E$
there exists a conformal chain passing through any prescribed composition chain
$G \supset B_1 \supset \cdots \supset B_{s-1} \supset E$. Hence all index types of maximal chains in G may
be obtained by stringing together all such types of the simple quotient groups in a
composition chain.

This theorem shows that if $m_i$ is the maximal length of chains in any simple
group $B_{i-1}/B_i$, then the maximal length of any chain in G is
$m_1 + m_2 + \cdots + m_s$.
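In the special case of a solvable group, every simple quotient group in a composition chain is cyclic of prime order, so each $m_i$ equals 1 and the sum reduces to the number of prime factors of the group order, counted with multiplicity. A minimal sketch of that count follows; the function name `prime_factor_count` is a hypothetical helper, and the reduction applies only under the solvability assumption.

```python
def prime_factor_count(n):
    """Number of prime factors of n, counted with multiplicity."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:          # a remaining factor larger than 1 is prime
        count += 1
    return count

# Assumed example: the symmetric group S_4 is solvable of order 24 = 2^3 * 3,
# so its chains of maximal length have four steps, for instance
# S_4 > D_4 > C_4 > C_2 > E.
max_chain_length_s4 = prime_factor_count(24)
print(max_chain_length_s4)
```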

Let us observe finally 

THEOREM 7. The necessary and sufficient condition that all complete chains of
maximal length shall be conformal is that all the simple quotient groups in a
composition chain have this property.


Chapter IV. Solvable groups


1. Properties of chains. We shall now apply the preceding results to derive
various properties of the solvable groups. We shall study in particular the
properties of maximal groups in solvable groups.

A solvable group may be defined as a group in which there exists a composition
chain whose indices are primes. From this definition follows trivially

THEOREM 1. In a solvable group all chains of maximal length are conformal.

The same result must obviously hold in any group in which there exists a
chain whose indices are all primes.

The solvable groups also have the characteristic property that there exists
a principal chain

(1) $G \supset L_1 \supset \cdots \supset L_n \supset E$

where the quotient groups $L_{i-1}/L_i$ are Abelian of type (p, p, ...). There
even exist characteristic chains with this property. When this fact is
combined with Theorem 4, Chapter III, one obtains

THEOREM 2. In a solvable group the index of any two consecutive subgroups
is a prime power dividing one of the indices in a principal chain.

From Theorem 9, Chapter I one concludes further:

THEOREM 3. In a pair of consecutive groups $A \supset B$ in a solvable group, B is
permutably contained in A.

2. Decompositions of groups. Let H be a subgroup of some group G. It is
convenient to say that H is decomposably contained in G if there exist two normal
subgroups $N' \supset N$ in G such that


(2) $H \cup N' = G$, $H \cap N' = N$.













We shall call N' the normal component of the decomposition (2) and say as
before that G splits over N'/N. We say further that G splits regularly over
N'/N if any other decomposition $H_1 \cup N' = G$, $H_1 \cap N' = N$ implies that
H and $H_1$ are conjugate in G.

We shall now prove various theorems which are of importance for the following. 


THEOREM 4. Let M be a maximal subgroup in an arbitrary finite group G
and let us suppose that G contains some minimal Abelian normal subgroup L of
order $p^\alpha$. If L is not contained in M, then M is decomposably contained in G with
normal component L,


(3) $M \cup L = G$, $M \cap L = E$,

and M is either normal or has $p^\alpha$ conjugates.


Proof. The first relation (3) is obvious when L is not contained in M. The
second follows from the fact that $M \cap L$ is normal in M and also in L since L is
Abelian, and consequently $M \cap L$ is normal in G and therefore equal to E.
The last statement in Theorem 4 follows from


THEOREM 5. A maximal group is normal or has as many conjugates as its 
index. 


Proof. If M is not normal in G, it is its own normalizer. 
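Theorem 5 can be verified exhaustively in a small case. The sketch below is not from the paper: it assumes G is the symmetric group on 4 points, lists every subgroup by adjoining one element at a time, and compares the number of conjugates of each maximal subgroup with its index. All function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation x -> p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def generated(gens, n):
    """Subgroup generated by gens inside the symmetric group on n points."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def all_subgroups(G, n):
    """Every subgroup of G, found by adjoining one new element at a time."""
    found = {generated([], n)}
    queue = list(found)
    while queue:
        H = queue.pop()
        for g in G:
            if g not in H:
                K = generated(list(H) + [g], n)
                if K not in found:
                    found.add(K)
                    queue.append(K)
    return found

def conjugates(H, G):
    """The set of distinct conjugates g H g^-1 with g in G."""
    return {frozenset(compose(compose(g, h), inverse(g)) for h in H) for g in G}

n = 4
G = list(permutations(range(n)))     # assumed example: symmetric group of order 24
subgroups = all_subgroups(G, n)
proper = [H for H in subgroups if len(H) < len(G)]
maximal = [M for M in proper if not any(M < K for K in proper)]

# One pair (index, number of conjugates) per maximal subgroup.
report = sorted((len(G) // len(M), len(conjugates(M, G))) for M in maximal)
print(report)
```

Each pair in `report` has second entry 1 (the normal case) or equal to the first entry, as the theorem asserts.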

It should also be noted in connection with Theorem 4 that if M in (3) does
not contain any normal subgroup of G, the normal subgroup L is uniquely
determined. If, namely, $G = M \cup L_1$, $M \cap L_1 = E$ were another
decomposition, where $L_1$ also is normal in G, then one sees that $L_1$ must be minimal
normal and Abelian and one finds by the Dedekind relation
$L \cup L_1 = L \cup (M \cap (L \cup L_1))$. Here $M \cap (L \cup L_1)$ is normal in M and also in $L \cup L_1$
since this group is Abelian, and hence $M \cap (L \cup L_1)$ would be a normal
subgroup of G contained in M.


THEOREM 6. Let B be normal in A and let A split regularly over B

(4) $A = H \cup B$, $B \cap H = E$


such that H is its own normalizer in A. Then any group G in which A and B are
contained normally will also split regularly


(5) $G = N \cup B$, $N \cap B = E$,

where N is the normalizer of H and also its own normalizer in G.


Proof. Since all conjugates of H belong to A, we have for any g in G,
$g H g^{-1} = a H a^{-1}$, where a belongs to A. This gives $g = a \cdot n$, where n belongs
to the normalizer N of H in G and the first relation (5) is obtained. Since none
of the elements of B can belong to N, the second follows.

Now let us suppose that there exists some relation (5) for a subgroup N of G.
Then one finds $A = B \cup (N \cap A)$, $(N \cap A) \cap B = E$ and $N \cap A$ must be
a conjugate of H. Since $N \cap A$ is normal in N, it follows that N must be the
normalizer of this conjugate of H and the decomposition (5) is regular. If any
element of B transformed N into itself, then N would contain more than one
conjugate of H, and this is impossible.

Theorem 6 may be used to establish the existence of certain maximal groups.
We shall apply it in the case where G contains the normal subgroups $A \supset B$,
where we suppose that B is a unique minimal normal Abelian subgroup of order
$p^\alpha$ and type (p, p, ...). Furthermore A/B shall be a minimal normal Abelian
subgroup of G/B of order $q^\beta$ and type (q, q, ...), where $p \ne q$. Then A splits
regularly over B with $A = C \cup B$, $C \cap B = E$, where C is a Sylow group
corresponding to q. Let us show further that C is its own normalizer in A. We denote
by $B_1$ the subgroup of B consisting of those elements which belong to the
normalizer of C. Since $B_1 = B \cap (C \cup B_1)$, it follows that $B_1$ is normal in $C \cup B_1$,
and hence the elements of $B_1$ and C are permutable and $B_1$ belongs to the center
of A. This center must be normal in G, so we have $B_1 = B$ or $B_1 = E$. The
first possibility is excluded since C would be a characteristic subgroup of A
contrary to the assumption that B was the only minimal normal Abelian
subgroup of G. It also follows that the number of conjugates of C is $p^\alpha$, and hence
$p^\alpha \equiv 1 \pmod{q}$ since C is a Sylow group.

We conclude from Theorem 6 

THEOREM 7. Let G contain the normal subgroups $A \supset B$, where B is a unique
minimal normal Abelian subgroup of G of order $p^\alpha$ and type (p, p, ...), while
A/B is some minimal normal Abelian subgroup of G/B of order $q^\beta$ and type
(q, q, ...) with $q \ne p$. Then A splits regularly


$A = C \cup B$, $C \cap B = E$,


where C is isomorphic to A/B and is its own normalizer in A. The group G also
splits regularly

$G = M \cup B$, $M \cap B = E$,


where M is the normalizer of C in G. Furthermore, M is maximal in G and the
number of its conjugates is $k = p^\alpha \equiv 1 \pmod{q}$.


Proof. It remains only to show that M is maximal in G. If for some maximal
subgroup $M_1$ one has $M_1 \supset M$, then $M_1 = M \cup (M_1 \cap B)$ and it follows from
the proof of Theorem 4 that $M_1 \cap B = B$ or $M_1 \cap B = E$, and this gives
$M_1 = G$ or $M_1 = M$.
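The hypotheses and conclusions of Theorem 7 can be followed in a concrete case. The example below is an illustration that is not taken from the paper: G is the symmetric group $S_4$, B the normal four-group $V_4$, and A the alternating group $A_4$.

```latex
% B = V_4 is the unique minimal normal Abelian subgroup of S_4, and A/B = A_4/V_4:
\[ |B| = 2^2 \quad (p = 2,\ \alpha = 2), \qquad |A/B| = 3 \quad (q = 3,\ \beta = 1). \]
% C = <(1 2 3)> is a Sylow 3-subgroup of A_4 and its own normalizer in A_4:
\[ A_4 = C \cup V_4, \qquad C \cap V_4 = E. \]
% M, the normalizer of C in S_4, is the symmetric group on {1, 2, 3}, of order 6:
\[ S_4 = M \cup V_4, \qquad M \cap V_4 = E, \qquad
   k = [S_4 : M] = 4 = 2^2 \equiv 1 \pmod{3}. \]
```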


3. Properties of solvable and nilpotent groups. It follows already from §1
that in a solvable group G any maximal group M is permutably contained and
its index is a power of a prime. This result may be improved as


THEOREM 8. Let G be solvable and M any maximal group. Then M is normally
or decomposably contained in G and the index of M is equal to an index in a
principal chain.







Proof. We shall say as before that a group M in G belongs to the normal
subgroup N of G if N is the maximal normal subgroup of G which M contains.
Now let M be maximal. In the group G/N the maximal group M/N must
belong to the unit group. Let L/N be a minimal normal subgroup of G/N
of order $p^\alpha$. From Theorem 4 it follows that $M \cup L = G$, $L \cap M = N$, and
our theorem is proved.

From Theorem 8 one obtains the following characterization of solvable 
groups: 


THEOREM 9. The necessary and sufficient condition for a group to be solvable 
is that in any complete chain any group shall always be normally or decomposably 
contained in the preceding. 


Proof. Theorem 8 shows that every chain in a solvable group must have 
this property. Conversely, when every chain has this property, G must contain 
normal subgroups, and the theorem follows by induction. 

Theorem 9 may be considered an analogue of the following characterization 
of nilpotent groups: 


THEOREM 10. The necessary and sufficient condition for a group to be nilpotent 
is that any complete chain be a composition chain. 


It is easily seen that every nilpotent group has this property. The converse 
follows from the fact that this condition on the chains implies that no subgroup 
can be its own normalizer, while in any group the normalizers of the Sylow 
groups are their own normalizers. 


4. Maximal groups. We shall now turn to the special properties of maximal 
subgroups in a solvable group G. We prove first 


THEOREM 11. Let G be solvable and N a normal subgroup. All maximal sub- 
groups of G belonging to N are conjugate, and conversely, all conjugate maximal 
subgroups belong to the same N. 


Proof. The last part of the theorem is obvious. To prove the first let us
observe that there is no limitation in assuming N = E since otherwise one need
only consider the quotient groups. Then M is a maximal group which does not
contain any normal subgroup of G, and hence according to Theorem 4 there
exists a unique minimal normal subgroup L of order $p^\alpha$ such that $M \cup L = G$,
$M \cap L = E$. Now let A/L be a minimal normal subgroup of G/L of order $q^\beta$.

We shall first have to show that $p \ne q$. Suppose, namely, that q = p. Then
A as a p-group must have a center. On the other hand, one has
$A = L \cup (M \cap A)$. Since the center of A is normal in G, it follows that L must belong
to it, and this implies that $M \cap A$ is normal in G contrary to the assumption.

We have therefore $q \ne p$ and $A = C \cup L$, $C \cap L = E$, where C is a Sylow
group of A corresponding to q. The conditions of Theorem 7 are satisfied and
all maximal groups of G not containing L must be normalizers of C or of one
of its conjugates.
















This proof permits us also to state the conditions for the existence of a maximal 
group corresponding to a given N. 

THEOREM 12. Let N be a normal subgroup of the solvable group G. The
necessary and sufficient conditions for the existence of maximal subgroups of G
belonging to N are:

(1) N must be normally cross-cut indecomposable in G. Then there exists a
single minimal normal subgroup $N_1/N$ of G/N of order $p^\alpha$.

(2) The group $G/N_1$ shall not contain any normal subgroup whose order is a
power of p.

Let us prove next 

THEOREM 13. Let G be a solvable group. To every prime p dividing the group 
order there exists a maximal group whose index is a power of p. 

Proof. This theorem is a consequence of Theorem 12. It may also be proved
independently by induction with respect to the group order. Let B be a minimal
normal subgroup. Since the theorem holds in G/B, we need only consider the
case where the order of B is $p^\alpha$, while the order of G/B is not divisible by p.
Furthermore, one can assume that B is the only minimal normal subgroup of G.
Now let A/B be a minimal normal subgroup of G/B. Then A/B is Abelian of
order $q^\beta$ and the conditions of Theorem 7 are satisfied. If C is a Sylow group
corresponding to q in A, then its normalizer M in G is a maximal subgroup
with the index $p^\alpha$.

By repeated applications of Theorem 13 one obtains the following theorem
of P. Hall:¹⁷

Let G be a solvable group of order $N = n \cdot m$, where n and m are relatively prime.
Then G has subgroups of orders n and m.

Hall¹⁸ has also shown that any group with this property must be solvable.
This implies that every group in which every subgroup has the property of
Theorem 13 must be solvable.

Hall has also obtained the following results:

Any two subgroups of order m are conjugate.

Any subgroup of G whose order divides m belongs to some subgroup of order m.

The number of subgroups of order m is a product of prime power factors $p^\alpha$,
each dividing an index in a principal chain and $p^\alpha \equiv 1 \pmod{q}$ for some prime
factor q of m.
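These statements can be read off in a small solvable group. The following worked example is an illustration that is not taken from the paper; it uses the symmetric group $S_4$ of order $24 = 8 \cdot 3$, whose principal chain $S_4 \supset A_4 \supset V_4 \supset E$ has the indices 2, 3, 4.

```latex
% Subgroups of the two relatively prime orders 8 and 3:
\[ \text{order } 8:\ \text{the three conjugate Sylow 2-subgroups } D_4, \qquad
   \text{order } 3:\ \text{the four conjugate subgroups } C_3. \]
% The counts are prime powers dividing an index of the principal chain:
\[ 4 = 2^2, \qquad 2^2 \mid 4, \qquad 2^2 \equiv 1 \pmod{3}, \]
\[ 3 = 3^1, \qquad 3 \mid 3, \qquad 3 \equiv 1 \pmod{2}. \]
```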

We shall conclude our investigations on maximal groups in solvable groups 
by proving 

THEOREM 14. In a solvable group any two maximal groups are permutable
or conjugate.

17 P. Hall, A note on soluble groups, Journal of the London Math. Soc., vol. 3 (1928),
pp. 98-105.
18 P. Hall, A characteristic property of soluble groups, Journal of the London Math. Soc.,
vol. 12 (1937), pp. 198-200.
















Proof. We shall prove the theorem by induction with respect to the group
order. Let $M_1$ and $M_2$ be a pair of maximal groups in G. If both $M_1$ and $M_2$
contain a minimal normal subgroup L of G, then $M_1$ and $M_2$ are conjugate or
permutable since $M_1/L$ and $M_2/L$ have this property. We may assume
therefore that $M_1$ and $M_2$ have no common minimal normal subgroup of G. Now
let us suppose that $M_1$ contains such a group L. Then according to Theorem 4
we have $G = M_2 \cup L$, $M_2 \cap L = E$, and consequently, $M_1 = L \cup (M_1 \cap M_2)$.
This representation of $M_1$ shows, however, that it is permutable with $M_2$.
In the final case where neither $M_1$ nor $M_2$ contains any normal subgroup of G,
they both belong to the unit group E and are therefore conjugate according to
Theorem 11.
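The alternative in Theorem 14 can be seen in the symmetric group $S_4$; this example is an illustration and is not taken from the paper. The maximal subgroups of $S_4$ are $A_4$, the three conjugate Sylow 2-subgroups $D_4$, and the four conjugate point stabilizers $S_3$.

```latex
% A_4 is normal, hence permutable with every subgroup.
% A non-conjugate pair D_4, S_3 intersects in a subgroup of order 2, so
\[ |D_4 S_3| = \frac{|D_4| \cdot |S_3|}{|D_4 \cap S_3|} = \frac{8 \cdot 6}{2} = 24 = |S_4|, \]
% and therefore the two subgroups are permutable:
\[ D_4 S_3 = S_4 = S_3 D_4. \]
```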

There are two extreme cases in Theorem 14. When all maximal groups are 
permutable, then we have already seen that the group is nilpotent. The case 
where all maximal subgroups are conjugate is solved by the following theorem. 


THEOREM 15. A group in which all maximal subgroups are conjugate is cyclic. 


Proof. Let $M_1, M_2, \dots$ be the maximal groups. Since it is known that
the elements belonging to these groups cannot be all elements of G, there must
exist an element g not in any $M_i$ and hence g must generate G.
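The fact used in this proof, that the conjugates of a proper subgroup cannot exhaust G, follows from a short count. A sketch, writing $|X|$ for the number of elements of X (a notation introduced here for convenience):

```latex
% M has at most [G : M] conjugates, each with |M| - 1 elements besides the unit element:
\[ \Bigl| \bigcup_{g \in G} g M g^{-1} \Bigr|
   \;\le\; [G : M]\,(|M| - 1) + 1
   \;=\; |G| - [G : M] + 1
   \;<\; |G| \qquad \text{since } [G : M] \ge 2. \]
```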

This theorem may be considered a generalization of the well-known theorem
that a p-group of order $p^\alpha$ which contains only one subgroup of index p must
be cyclic.


5. Chains in solvable groups. In all finite groups there exist normal completely
reducible chains $G = C_r \supset C_{r-1} \supset \cdots \supset C_2 \supset C_1 \supset E$, where $C_{i+1}/C_i$
is the union of minimal normal subgroups of $G/C_i$. In a solvable group the
quotient groups $C_{i+1}/C_i$ are Abelian groups in which every element has prime
order. For these chains various interesting properties have been derived.

In a solvable group one can also construct normal nilpotent chains


(6) $G = N_r \supset N_{r-1} \supset \cdots \supset N_2 \supset N_1 \supset E$,


where each $N_{i+1}/N_i$ is a normal nilpotent subgroup of $G/N_i$. We have already
observed that in any group there exists a maximal normal nilpotent subgroup
$M_1$. Similarly, $G/M_1$ has a maximal nilpotent normal subgroup $M_2/M_1$,
and by continuing this process, one obtains in a solvable group a unique maximal
normal nilpotent ascending chain


(7) $G = M_r \supset M_{r-1} \supset \cdots \supset M_2 \supset M_1 \supset E$.

One can then prove the following:


Let (7) be the maximal normal nilpotent ascending chain in a solvable group G
while (6) is any normal nilpotent chain. Then for every i, $M_i \supseteq N_i$.


From this result follows: 


The maximal nilpotent chain has the shortest length among the normal nilpotent 
chains. 













These theorems are quite analogous to the results on normal completely 
reducible chains and various other theorems also hold for both classes of chains. 

Let us now dualize the preceding theory. In any group there exists a smallest
normal subgroup $M_p$ such that $G/M_p$ is a p-group. The cross-cut of all $M_p$ gives
the smallest normal subgroup M such that G/M is a nilpotent quotient group.
Obviously M is a characteristic subgroup. In M there exists a corresponding
smallest normal group with nilpotent quotient group. Thus one obtains in any
solvable group a maximal normal nilpotent descending chain


(8) $G = \overline{M}_r \supset \overline{M}_{r-1} \supset \cdots \supset \overline{M}_2 \supset \overline{M}_1 \supset E$.


It follows then further:

The descending and ascending maximal normal nilpotent chains (8) and (7)
in a solvable group both have the same length and are shortest normal nilpotent
chains. If $G = N_r \supset N_{r-1} \supset \cdots \supset N_2 \supset N_1 \supset E$ is any shortest normal
nilpotent chain in G, then one has for every i, $M_i \supseteq N_i \supseteq \overline{M}_i$.

One can also consider normal Abelian chains

(9) $G = A_r \supset A_{r-1} \supset \cdots \supset A_1 \supset E$,


where every quotient group $A_{i+1}/A_i$ is Abelian. Such chains must exist in
every solvable group. But here one cannot usually define the maximal normal
Abelian subgroup since the union of two normal Abelian groups need not be
Abelian. One could avoid this difficulty by considering groups which are the
union of Abelian normal subgroups. There exists, however, a smallest
characteristic group with Abelian quotient group, namely, the commutator group.
Correspondingly, one obtains the chain of commutator groups


(10) $G = C_r \supset C_{r-1} \supset \cdots \supset C_1 \supset E$,


and we can say: The commutator chain (10) is a shortest chain among the normal
Abelian chains, and if (10) is the commutator chain and (9) an arbitrary Abelian
chain, then for every i, $A_i \supseteq C_i$.

Chapter V. Groups with conformal chains 


1. Dispersible groups. We shall now study various classes of special solvable
groups.

Let N be the order of a group G and let


(1) $N = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}$, $p_1 > p_2 > \cdots > p_r$

be its prime factorization with the prime factors arranged in decreasing order.
A type of group which occurs in various investigations is that in which there
exists a chain of normal subgroups in G

(2) $E = K_0 \subset K_1 \subset \cdots \subset K_r = G$

with the corresponding orders


(3) $1,\ p_1^{\alpha_1},\ p_1^{\alpha_1} p_2^{\alpha_2},\ \dots,\ N$.










Such groups shall be called dispersible groups. In certain cases one can also
introduce the group orders (3) in reverse order of the primes


(4) $1,\ p_r^{\alpha_r},\ p_r^{\alpha_r} p_{r-1}^{\alpha_{r-1}},\ \dots,\ N$.


We shall say then that G is a reverse dispersible group.

Both types of dispersible groups are obviously solvable and form a type of
group including the nilpotent groups. It follows from the results of Hall
mentioned above that if $K_i \supset K_j$ in a chain (2), then $K_i$ splits regularly over $K_j$.
The chain (2) is also a characteristic chain of the group.
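Three small groups separate the two notions. The examples below are illustrations that are not taken from the paper; $V_4$ again denotes the normal four-group.

```latex
% S_3, order 6 = 3 \cdot 2: the normal chain has orders 1, 3, 6, so S_3 is dispersible.
\[ E \subset A_3 \subset S_3. \]
% A_4, order 12 = 3 \cdot 2^2: there is no normal subgroup of order 3, so A_4 is not
% dispersible, but the normal chain below has orders 1, 2^2, 12: reverse dispersible.
\[ E \subset V_4 \subset A_4. \]
% S_4, order 24 = 3 \cdot 2^3: the only normal subgroups are
\[ E,\quad V_4,\quad A_4,\quad S_4 \qquad \text{of orders } 1,\ 4,\ 12,\ 24, \]
% so there is no normal subgroup of order 3 or of order 8: S_4 is of neither type.
```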

Now let G be any group. It is easily seen that the union of two dispersible
groups is again dispersible, and hence there exists a unique maximal normal
dispersible subgroup. This shows the existence of an ascending chain of maximal
normal dispersible groups

(5) $G = D_r \supset D_{r-1} \supset \cdots \supset D_2 \supset D_1 \supset E$

in any solvable group. These groups are all characteristic subgroups.
Similarly, one sees the existence of a minimal normal group D such that G/D is a
dispersible group and one obtains also a descending chain of maximal normal
dispersible groups


(6) $G = \overline{D}_r \supset \overline{D}_{r-1} \supset \cdots \supset \overline{D}_2 \supset \overline{D}_1 \supset E$.


As before one finds that both chains (5) and (6) have the same length and are
shortest chains among all normal dispersible chains. If
$G = C_r \supset C_{r-1} \supset \cdots \supset C_2 \supset C_1 \supset E$ is any shortest dispersible chain, then one
finds as before that $D_i \supseteq C_i \supseteq \overline{D}_i$.

Among other properties let us mention that any subgroup and any quotient 
group of a dispersible group is again dispersible. 

The following theorem does not hold for reverse dispersible groups: 


THEOREM 1. The necessary and sufficient condition for a group to be dispersible
is that there exist a complete chain $E \subset A_1 \subset A_2 \subset \cdots \subset A_n = G$, where the
indices are primes in non-increasing order.


Proof. The theorem follows by induction with respect to the group order.
It is then sufficient to prove that G contains a normal subgroup of order $p_1^{\alpha_1}$
because one can apply the same argument to the quotient groups. Now
according to assumption $A_{n-1}$ contains a normal subgroup of order $p_1^{\alpha_1}$ and this
subgroup must be unique and a characteristic subgroup. But then it must also
be normal in $A_n$ since $A_{n-1}$ is normal in $A_n$ according to Theorem 20, Chapter I.


2. Groups with conformal chains. We shall say that the chains in a group G 
are conformal when all the complete chains in G have the same length and 
index type. 













We shall first study the solvable groups with conformal chains. These 
groups may be characterized as follows: 

THEOREM 2. The necessary and sufficient condition for a solvable group to have 
conformal chains is that the index of consecutive subgroups always be a prime. 

Proof. The definition of a solvable group shows that there exists a chain 
in which all indices are primes; hence all complete chains must have this property 
if they are to be conformal. 
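Two small solvable groups show both sides of this criterion. The examples are illustrations that are not taken from the paper; the numbers in parentheses are the indices of each complete chain.

```latex
% S_3: every complete chain has the prime indices 2 and 3 in some order: conformal chains.
\[ S_3 \supset C_3 \supset E \quad (2,\ 3), \qquad S_3 \supset C_2 \supset E \quad (3,\ 2). \]
% A_4: a Sylow 3-subgroup is maximal of index 4, which is not a prime: chains not conformal.
\[ A_4 \supset V_4 \supset C_2 \supset E \quad (3,\ 2,\ 2), \qquad
   A_4 \supset C_3 \supset E \quad (4,\ 3). \]
```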

A consequence of this theorem is 


THEOREM 3. The complete chains in a nilpotent group are conformal. 
Another consequence of Theorem 1 is


THEOREM 4. In a solvable group the condition that all complete chains shall
have the same length implies that they are conformal.


Proof. The length of the chains must be equal to the number of prime factors 
in the group order, since there always exists one chain with this property in a 
solvable group. 

We shall now proceed to the deduction of the following important fact: 


THEOREM 5. Every group with conformal chains is solvable.


Proof. Let us suppose that the theorem were not true. Then there would 
exist simple groups with conformal chains. Among these we choose one of 
minimal order. Such a simple group G would have the property that all sub- 
groups were solvable groups with conformal chains. 

Now let M be a maximal group in G. We shall show that the index
$m = [G : M]$ is a prime number. Let us suppose, namely, that m is composite and
that m is divisible by $p^\alpha$, $\alpha \ge 1$, while the order of M is divisible by the prime
p to the power $p^\mu$. A Sylow group $S_p$ of G corresponding to p has complete
chains of length $\mu + \alpha$ and all indices equal to p.

We construct two complete chains in G. The first begins with the $\mu + \alpha$
terms in $S_p$. The other is obtained by completing a chain in M to a chain in G
by adding the group G with the index m. Both chains shall have the same
indices in some order. But since the first chain contains $\mu + \alpha$ indices p, while
the second only contains $\mu$ of them, one must have $\alpha = 1$ and m = p.

This shows that all maximal groups in our simple group G must have prime 
indices. Furthermore, G must have at least two maximal groups with different 


prime indices. This follows from the 

Lemma. Any group contains maximal subgroups whose indices are not divisible 
by any prescribed prime. 

Proof. Any maximal group M containing a Sylow group corresponding to p 
has an index not divisible by p. 
















The contradiction proving Theorem 5 is now a direct consequence of the 
following theorem: 


THEOREM 6. Let G be a finite group containing a maximal group with prime
index p. If p is not the greatest prime dividing the group order, then G is composite.


Proof. G can be represented as a substitution group of degree p. This 
representation cannot be a true representation since any element whose order 
is a prime greater than p must correspond to the unit permutation. 

It should be observed that when p is the greatest prime dividing the group 
order, the group may be simple. An example is the alternating group on 5 
elements of order 60 which has a subgroup of order 12. 
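The step in the proof of Theorem 6 that sends elements of large prime order to the unit permutation can be written out. In this sketch $\varphi$ denotes the representation of G on the p cosets of the maximal group M, and $\operatorname{ord}$ the order of an element; both symbols are introduced here for convenience.

```latex
% The representation and its kernel:
\[ \varphi : G \to S_p, \qquad |S_p| = p!, \qquad \ker \varphi \subseteq M \ne G. \]
% Let q > p be a prime dividing the group order and x an element of order q. Then
\[ \operatorname{ord} \varphi(x) \mid q, \qquad \operatorname{ord} \varphi(x) \mid p!, \qquad q \nmid p!, \]
% so that
\[ \varphi(x) = 1, \qquad E \ne \ker \varphi \ne G, \]
% and the kernel is a proper normal subgroup of G different from the unit group.
```

In the alternating group on 5 elements, p = 5 is the greatest prime dividing 60, so this argument does not apply, in agreement with the remark above.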

We shall now mention a few other properties of groups with conformal chains. 
When Theorem 1 is combined with Theorem 13, Chapter IV, one obtains 


THEOREM 7. In a group with conformal chains any arrangement of the prime 
factors of the group order represents the indices of some chain. 


This implies further 


THEOREM 8. A group with conformal chains has subgroups of every order 
dividing the group order. 


It is also seen that every subgroup and every quotient group has the same 
property. 
By combining Theorem 7 with Theorem 1, one also sees: 


THEOREM 9. A group with conformal chains is dispersible. 


3. Groups in which there exists a principal complete chain. We shall now 
prove a theorem which brings the groups with conformal chains in connection 
with another important type of group. 


THEOREM 10. The necessary and sufficient condition that a group G shall have 
conformal chains is that G contain a principal chain which is complete; i.e., the 
indices in a principal chain shall be primes. 


Proof. We suppose first that such a principal chain exists. Then G is 
solvable and from Theorem 4, Chapter III it follows that all indices in any 
complete chain must be primes. Theorem 2 shows then that G must have 
conformal chains. 

To prove the converse let G be a group with conformal chains. To prove 
that the indices in a principal chain are primes it is sufficient to show that G 
contains one normal subgroup of prime order, because the same argument may 
then be applied to the quotient groups. 

According to Theorem 9, G contains a normal Sylow group $S_1$ of order $p_1^{\alpha_1}$,
where $p_1$ is the largest prime dividing the group order. The $p_1$-group $S_1$ has
a center $C_1$ which is a characteristic subgroup of $S_1$, and hence normal in G.













This shows finally that G contains a minimal normal Abelian subgroup $L_1$ of
order $p_1^{l}$ and type $(p_1, p_1, \cdots)$ contained in the center of $S_1$.

Now let K denote a subgroup of G of index $p_1^{\alpha_1}$ and order $N_1 = p_2^{\alpha_2} \cdots p_r^{\alpha_r}$.
All such subgroups K are conjugate and for any of them one must have 


We consider the subgroup

(8)   $G_1 = K \cup L_1, \qquad K \cap L_1 = E$

of order $p_1^{l} \cdot N_1$. Since $G_1$ also has conformal chains, it must have subgroups of
every order according to Theorem 8. Let H denote a subgroup of $G_1$ of order
$p_1 \cdot N_1$. According to a previous remark we can always choose a suitable
conjugate of H such that $H \supset K$. By the Dedekind relation it follows from (8)
that $H = K \cup (H \cap L_1)$. Here the groups K and $H \cap L_1$ are permutable
and the order of $H \cap L_1$ is $p_1$. We shall show that $H \cap L_1$ must be normal
in G. First, this group is normal in H and hence $H \cap L_1$ is left invariant by
transformation with any element of K. Secondly, $H \cap L_1$ belongs to the center
of $S_1$ and from (7) it follows that $H \cap L_1$ is normal in G.

The proof of Theorem 10 shows incidentally that a group with conformal 
chains must have a principal chain 


(9)   $E \subset N_1 \subset N_2 \subset \cdots \subset N_s = G$,


where the indices are primes in non-increasing order. 


4. Construction of groups with conformal chains. Now let G be a group
with conformal chains and (9) some complete principal chain. The index
$[N_{i+1} : N_i] = p$ is a prime and the group of automorphisms of $N_{i+1}/N_i$ is cyclic of order
$p - 1$. Any inner automorphism of G defined by some element g will induce
some automorphism in $N_{i+1}/N_i$. If this automorphism is the unit
automorphism for all i and for all elements g, then $N_{i+1}/N_i$ belongs to the center of
$G/N_i$ and G must be nilpotent according to a well-known criterion for nilpotent
groups.¹⁹ This gives for instance


THEOREM 11. Let G be a group with conformal chains. If the congruence 
p = 1 (mod q) is not satisfied for any pair of primes p and q dividing the group 
order, then G is nilpotent. 
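
The arithmetic hypothesis of Theorem 11 is easy to test for a given group order. The following Python sketch lists the pairs of prime divisors p, q with p ≡ 1 (mod q); the function names are hypothetical and not taken from the paper, and the sketch checks only the congruence, not whether any group of that order has conformal chains.

```python
def prime_divisors(n):
    """Return the sorted list of distinct primes dividing n (n >= 1)."""
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def congruence_pairs(order):
    """Pairs (p, q) of distinct primes dividing `order` with p = 1 (mod q)."""
    ps = prime_divisors(order)
    return [(p, q) for p in ps for q in ps if p != q and p % q == 1]


def theorem_11_applies(order):
    """True when no pair of prime divisors satisfies the congruence."""
    return not congruence_pairs(order)
```

For order 15 no pair satisfies the congruence, so a group of order 15 with conformal chains is nilpotent by the theorem. For order 6 the pair (3, 2) satisfies it, so the theorem gives no conclusion.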

The criterion for nilpotent groups also gives the following important property 
of groups with conformal chains: 


19 See pp. 166-167 of the reference in footnote 7.

20 This theorem may also be considered as a consequence of a result of G. Zappa, Sui
gruppi supersolubili, Rendiconti del Sem. Mat. di Roma, (4), vol. 2(1938), pp. 323-330. 
In this paper, which appeared after the completion of my paper, it is shown that a group 
with complete principal chains (supersolvable group) has a nilpotent commutator group. 
The proof given above is considerably simpler than the proof given by Zappa. (Added in 
proof.) 
















THEOREM 12. If G is a group with conformal chains, then its commutator group 
is nilpotent. 


Proof. Since the group of automorphisms of $N_{i+1}/N_i$ is Abelian, any
commutator in G must induce the identical automorphism in all such quotient
groups. Furthermore, a principal chain in the commutator group C is given by


$E \subseteq C \cap N_1 \subseteq C \cap N_2 \subseteq \cdots \subseteq C \cap N_s = C$.


But an element c in C must also induce the unit automorphism in $(N_{i+1} \cap C)/(N_i \cap C)$,
hence C is nilpotent.

Theorem 12 states that every group with conformal chains can be obtained 
from a nilpotent group C by extension with an Abelian factor group G/C. Not 
all such groups will have conformal chains or a complete principal chain, but 
this property gives an interesting idea of the generality of such groups. 


5. Groups with subgroups of every possible order. We shall now turn to 
another characterization of our groups with conformal chains. Let us say for 
short that a group G has subgroups of every possible order if it contains a subgroup 
of order n for every n dividing the group order N. It follows from the char- 
acterization of solvable groups by P. Hall that such groups must be solvable. 
It is also clear that they must have subgroups of index p for every prime p 
dividing the group order. This implies, as we have seen before, that certain 
of the indices in a principal chain must be equal to these primes; in particular 
G must have a normal subgroup of index $p_r$, where $p_r$ is the smallest prime
dividing the group order.
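
For small permutation groups the property of having subgroups of every possible order can be checked by brute force. The sketch below is an illustration only: the helper names are hypothetical, permutations are stored as tuples, and all subgroups are found by repeatedly adjoining one element and closing under multiplication.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q, then p' (tuples as maps i -> p[i])."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_even(p):
    """Parity of a permutation via its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


def closure(gens, identity):
    """Subgroup generated by gens; in a finite group products suffice."""
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)


def all_subgroups(group):
    """Every subgroup, reached by adjoining one element at a time."""
    identity = tuple(range(len(next(iter(group)))))
    found = {closure([], identity)}
    frontier = list(found)
    while frontier:
        h = frontier.pop()
        for g in group:
            if g not in h:
                k = closure(list(h) + [g], identity)
                if k not in found:
                    found.add(k)
                    frontier.append(k)
    return found


def subgroup_orders(group):
    """Sorted list of the orders of all subgroups of `group`."""
    return sorted({len(h) for h in all_subgroups(group)})


S3 = set(permutations(range(3)))
S4 = set(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}
```

Calling `subgroup_orders(A4)` shows that the alternating group on 4 elements, of order 12, has no subgroup of order 6, while `subgroup_orders(S4)` contains every divisor of 24. The symmetric group on 4 elements therefore has subgroups of every possible order although its subgroup of order 12 does not, which is why hypotheses on all subgroups are imposed in the theorems of this section.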

It would be an interesting problem to determine all groups having subgroups 
of every possible order, but in spite of the great limitations imposed on the 
group by this condition it seems rather difficult to obtain a simple character- 
ization of such groups. This remains true even if one supposes that all sub- 
groups shall have this property. In this case one can, however, prove 


THEOREM 13. Let G be a group such that G and every subgroup of G have sub- 
groups of every possible order. Then G is dispersible. 


Proof. G must have a subgroup $G_1$ whose index is $p_r$ and hence $G_1$ is normal
in G. When the same argument is applied to $G_1$, the theorem becomes a
consequence of Theorem 1.

Theorem 13 implies that G has a normal subgroup whose order is $p_1$. By
induction one concludes 


THEOREM 14. Let G be a group such that every subgroup and every quotient
group of G have subgroups of every possible order. Then G is a group with
conformal chains, and conversely.

In connection with this theorem it would be of interest to know whether the 
condition on the quotient groups is necessary. If it is needed, it might be shown 
through the construction of a suitable example. 







The investigations of this last chapter may be recapitulated as follows: 

There exists a type of solvable groups which is characterized completely by 
each of the following properties: 

(α) Groups with conformal chains.

(β) Groups in which all consecutive indices are primes.

(γ) Groups with a complete principal chain.

(δ) Groups such that all subgroups and quotient groups have subgroups of
every possible order.

Many very interesting other types of groups are included in this general type 
of groups, and I hope to return to some of these studies upon a later occasion. 


YALE UNIVERSITY.
















INVARIANTS OF CERTAIN STOCHASTIC TRANSFORMATIONS: 
THE MATHEMATICAL THEORY OF GAMBLING SYSTEMS 


By PAUL R. HALMOS


Introduction. The “Regellosigkeit” principle of von Mises has been shown
to correspond in the mathematical theory of probability to the fact that certain
transformations of infinite dimensional Cartesian space into itself are measure
preserving. It is the purpose of this paper to investigate the behavior of such
transformations on more general spaces. The theorems at the basis of this
work are stated in the first section and applied to obtain results concerning the
existence and independence of “Kollektivs” in the second and third sections.
In §§4, 5, and 6 certain invariants of the transformations considered are
obtained. Previous results on these transformations are shown to be special
cases of these invariance theorems.


1. Preliminary definitions and theorems. In this section we shall define the 
concepts and state the theorems which are the basis of all the work of the 
later sections. 

DEFINITION 1. A collection $\mathfrak{F}_1$ of sets in a space $\Omega_1$ is a field if $E_1 \in \mathfrak{F}_1$ and
$E_2 \in \mathfrak{F}_1$ implies $E_1 + E_2 \in \mathfrak{F}_1$ and $E_1 - E_1 E_2 \in \mathfrak{F}_1$.¹

DEFINITION 2. A collection $\mathfrak{B}_1$ of sets in a space $\Omega_1$ is a Borel field if $\mathfrak{B}_1$ is a
field and if $E_j \in \mathfrak{B}_1$ $(j = 1, 2, \cdots)$ implies $\sum_{j=1}^{\infty} E_j \in \mathfrak{B}_1$.

DEFINITION 3. A probability measure is an additive, non-negative set function
$P_1(E)$ defined on a field $\mathfrak{F}_1$ in a space $\Omega_1$, with $P_1(\Omega_1) = 1$, such that
$P_1(\sum_{j=1}^{\infty} E_j) = \sum_{j=1}^{\infty} P_1(E_j)$ whenever $\{E_j\}$ is a sequence of disjunct sets belonging
to $\mathfrak{F}_1$ whose sum is also in $\mathfrak{F}_1$.

DEFINITION 4. A space $\Omega_1$ in which a probability measure $P_1$ has been
defined on a Borel field $\mathfrak{B}_1$ is a probability space.

DEFINITION 5. A measurable set in a probability space $\Omega_1$ is a set E such that
$E \in \mathfrak{B}_1$.

DEFINITION 6. Let $\Omega_1$ be a probability space and $\Omega_1'$ a space on which there
is given a Borel field $\mathfrak{B}_1'$ of measurable sets. Let $\phi(x)$ be a single-valued function
whose domain is $\Omega_1$ and whose range is in $\Omega_1'$. $\phi(x)$ is a measurable function
if the set E of points $x \in \Omega_1$ for which $\phi(x)$ is in $E' \subset \Omega_1'$ is measurable whenever
$E'$ is.² If $\phi(x)$ is real valued, it is measurable if for every real number $\lambda$ the

Received May 2, 1938. 

1 All the fields used in this paper will also satisfy the condition that if $E \in \mathfrak{F}_1$, then
$CE \in \mathfrak{F}_1$, where CE is the complement in $\Omega_1$ of the set E.

2 The symbol $\{\phi(x) \in E'\}$ will be used to denote E.




set $\{\phi(x) < \lambda\}$ is measurable. The Borel field of sets on the real line is taken,
in this paper, as the collection of Borel sets.

DEFINITION 7. If $\phi(x)$ is a real-valued measurable function on $\Omega_1$, the function
$F(\lambda)$ of the real variable $\lambda$, $F(\lambda) = P_1\{\phi(x) < \lambda\}$, is its distribution function.

DEFINITION 8. Two measurable functions $\phi$ and $\phi'$ (defined on $\Omega_1$ and $\Omega_1'$,
respectively) whose ranges lie in the same space $\Omega_1''$ have the same distribution
if, for every $E'' \in \mathfrak{B}_1''$, $P_1\{\phi \in E''\} = P_1'\{\phi' \in E''\}$. Thus, in particular, two
real-valued measurable functions have the same distribution if and only if
their distribution functions are identical.

DEFINITION 9. The class of all measurable functions, with ranges in some
fixed space, but not necessarily all defined on the same space, such that every
two have the same distribution is a chance variable. Any member of this class
is a representation of the chance variable.

So far we have defined a single chance variable $\phi$, in isolation from all other
chance variables. If $\phi'$ is another chance variable (with range in the range
space of $\phi$), our definition enables us to answer the questions “What is the
probability that $\phi \in E$?” and “What is the probability that $\phi' \in E'$?”, but it
does not give an answer to the question “What is the probability that both
$\phi \in E$ and $\phi' \in E'$?” Since chance variables usually present themselves not
singly but in sets and are connected with each other in rather special ways,
we are led to the following considerations.

Associated with every probability space $\Omega_1$ there is another space $\Omega$ defined
as follows. Let $\Omega$ be the space of all infinite sequences $\omega = \{x_1, x_2, \cdots\}$,
where $x_j \in \Omega_1$ $(j = 1, 2, \cdots)$. Let $\mathfrak{B}$ be the smallest Borel field which contains
every set determined by conditions of the form $x_j \in E_j$, $E_j \in \mathfrak{B}_1$ $(j = 1, \cdots, n)$.

Until we define a probability measure on $\mathfrak{B}$, we may not consider $\Omega$ as a
probability space. We make, however, the following definition.

DEFINITION 10. If a probability measure P is defined on the Borel field $\mathfrak{B}$
in $\Omega$, the probability space $\Omega$ is a stochastic process associated with $\Omega_1$.³


DEFINITION 11. Let $\alpha_1, \cdots, \alpha_n$ be any finite set of subscripts. The set E
is a cylinder set over $x_{\alpha_1}, \cdots, x_{\alpha_n}$ if, whenever $\omega \in E$, $\omega = \{x_1, x_2, \cdots\}$, then
any point $\omega'$, obtained from $\omega$ by altering only coördinates other than $x_{\alpha_1}, \cdots, x_{\alpha_n}$,
is also in E.

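The collection of all measurable cylinder sets over $x_{\alpha_1}, \cdots, x_{\alpha_n}$ forms a Borel
field $\mathfrak{B}_{\alpha_1, \cdots, \alpha_n} \subset \mathfrak{B}$.


DEFINITION 12. If two probability measures are defined on the Borel fields
$\mathfrak{B}'$ and $\mathfrak{B}''$ respectively, $\mathfrak{B}' \subset \mathfrak{B}$, $\mathfrak{B}'' \subset \mathfrak{B}$, they are coherent if they assign the
same values to sets common to $\mathfrak{B}'$ and $\mathfrak{B}''$.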

In terms of the preceding three definitions we are now able to formulate a 
mathematical description of at least one of the ways in which chance variables 
occur in physical problems. 

DEFINITION 13. A sequence, finite or infinite, of chance variables $\phi_n$ (with


3 This is not the most general definition of stochastic process, but it is the one that is to
be used exclusively in this paper.
















ranges all in the same space $\Omega_1$) is stochastic if the following conditions are
satisfied.

(i) To every set of conditions of the form $\phi_j \in E_j$, $E_j \in \mathfrak{B}_1$ $(j = 1, \cdots, n)$
there corresponds a number $P_n(\phi_1 \in E_1, \cdots, \phi_n \in E_n)$.

(ii) A unique probability measure may be so defined on the Borel field
$\mathfrak{B}_n = \mathfrak{B}_{1, \cdots, n}$ that the set $\{x_1 \in E_1\} \cdots \{x_n \in E_n\}$ has measure


$P_n(\phi_1 \in E_1, \cdots, \phi_n \in E_n)$.


(iii) The probability measures on the Borel fields $\mathfrak{B}_j$ $(j = 1, 2, \cdots)$ are
coherent, each with the others.

(iv) The function⁴ $x_j(\omega)$ is a representation of $\phi_j$ $(j = 1, 2, \cdots)$.

The following is a fundamental theorem on the representation of stochastic
sequences of chance variables. It was proved in the case where $\Omega_1$ is the real
line by Kolmogoroff⁵ and in the general case by Doob.⁶


THEOREM 1. Given a stochastic sequence $\{\phi_n\}$ of chance variables, a unique
probability measure P may be so defined on $\mathfrak{B}$ that P is coherent with each $P_n$
$(n = 1, 2, \cdots)$.

Let $\Omega_1$ be a probability space, and for each n let the chance variable $\phi_n$,
with domain and range on $\Omega_1$, take the value x at the point $x \in \Omega_1$. It is well
known⁷ that if to every set of conditions of the form $\phi_j \in E_j$, $E_j \in \mathfrak{B}_1$
$(j = 1, \cdots, n)$ we assign the number $P_1(\phi_1 \in E_1) \cdots P_1(\phi_n \in E_n)$ the sequence
$\{\phi_n\}$ is stochastic.

THEOREM 2. Given any probability space $\Omega_1$, a unique probability measure P
may be so defined on $\mathfrak{B}$ that a set determined by the conditions $x_j \in E_j$, $E_j \in \mathfrak{B}_1$
$(j = 1, \cdots, n)$ has the measure $P_1(\phi_1 \in E_1) \cdots P_1(\phi_n \in E_n)$.


DEFINITION 14. A system is a sequence $\{f_n\}$ of measurable functions on the
stochastic process $\Omega$ satisfying the following conditions.

(i) $f_1(\omega) \equiv 0$ or else $f_1(\omega) \equiv 1$.

(ii) $f_n(\omega)$ $(n > 1)$ depends only on $x_1, \cdots, x_{n-1}$.

(iii) $f_n(\omega)$ $(n = 1, 2, \cdots)$ takes only the values 0 and 1.

(iv) $P(\limsup f_n(\omega) = 1) = 1$.⁸


4 $x_j(\omega)$ is the function which at the point $\omega = \{x_1, x_2, \cdots\}$ takes the value $x_j$.

5 X, p. 27. (See bibliography at the end of this paper.)

6 VI, §1.

7 Saks, XII, Chapter 3. The property referred to is the conclusion of Theorem 1 for a
finite sequence of chance variables.

8 A gambling system has been so defined by Doob, IV, p. 365. Essentially the same
definition was suggested, independently, by Birnbaum and Schreier, I; Wald, XIII; and
Huntemann, IX. This definition describes mathematically our intuitive idea of a gambling
system. The player will bet on the outcome of the n-th play if $f_n = 1$, and will refrain from
betting otherwise. Condition (ii) states that at each stage the player knows the results
of the preceding trials only. Condition (iv) is merely a mathematical convenience which
ensures that the probability is one that the player bets an infinite number of times.













With every system on $\Omega$ we associate a transformation T, defined almost
everywhere on $\Omega$ and taking $\Omega$ into itself, as follows.

DEFINITION 15. Let $a_n(\omega)$ be the lowest integer satisfying the equation
$\sum_{j=1}^{a_n} f_j(\omega) = n$. Then, by condition (iv) in the definition of a system, with
almost every $\omega$ there is associated an infinite sequence $\{a_n\}$ of subscripts. The
system transformation T is defined by
$T\{x_1, x_2, \cdots\} = \{x_1', x_2', \cdots\} = \{x_{a_1}, x_{a_2}, \cdots\}$.⁹


DEFINITION 16. Let $\Omega_1$ and $\Omega_1'$ be probability spaces and let $T_1$ be a
transformation with domain $\Omega_1$ and range in $\Omega_1'$. $T_1$ is measure preserving if for
every measurable set $E' \subset \Omega_1'$ the set $E = \{T_1(\omega) \in E'\}$ is measurable and
$P_1(E) = P_1'(E')$. We shall also write $T_1^{-1}(E')$ for E.

DEFINITION 17. If the real-valued measurable function $\phi(x)$ defined on the
probability space $\Omega_1$ is summable,¹⁰ $\int_{\Omega_1} \phi \, dP_1$ is the expectation of the chance
variable represented by $\phi$. Usually when we work with an integral on the
whole space, we shall omit the range of integration in the symbol. We write
$E(\phi)$ for the expectation of $\phi$.

DEFINITION 18. Let $\Omega$ be a stochastic process, n a positive integer, and A
a measurable set, $A \subset \Omega$. The set function $P(AE)$, $E \in \mathfrak{B}_{1, \cdots, n}$, is a probability
measure on $\mathfrak{B}_{1, \cdots, n}$ that vanishes whenever $P(E) = 0$. Hence, by a well known
theorem on absolutely continuous measures,¹¹ there exists a non-negative summable
function on $\Omega$, uniquely determined except for a set of measure zero and
depending on the coördinates $x_1, \cdots, x_n$ only, say $P(x_1, \cdots, x_n; A)$, such that


$P(AE) = \int_E P(x_1, \cdots, x_n; A) \, dP$ for every set $E \in \mathfrak{B}_{1, \cdots, n}$. $P(x_1, \cdots, x_n; A)$


is the conditional probability of A for given $x_1, \cdots, x_n$.¹²


DEFINITION 19. A stochastic process for which $P(x_1, \cdots, x_{n-1}; x_n \in E)$ is
almost everywhere independent of $x_1, \cdots, x_{n-1}$ for every $n = 1, 2, \cdots$, and
every $E \in \mathfrak{B}_1$, is independent. (A stochastic sequence of chance variables will
be called independent if the corresponding stochastic process is.)

DEFINITION 20. An independent stochastic process $\Omega$ for which the $x_n(\omega)$
all have the same distribution is stationary.

It is readily seen that if $\{\phi_n\}$ is a stochastic sequence of independent chance
variables, $P(\phi_1 \in E_1, \cdots, \phi_n \in E_n) = P(\phi_1 \in E_1) \cdots P(\phi_n \in E_n)$ (for $n = 1, 2, \cdots$
and $E_j \in \mathfrak{B}_1$ $(j = 1, \cdots, n)$), and conversely.

In terms of our definitions we can now state the following fundamental 
theorems. 


THEOREM 3. If Q(E) is an additive, non-negative set function defined on a
field $\mathfrak{F}_1$ in a space $\Omega_1$, with $Q(\Omega_1) = 1$, and if $Q(\sum_{j=1}^{\infty} E_j) = \sum_{j=1}^{\infty} Q(E_j)$ whenever


9 Theorem 1 shows that the transformed space $\Omega'$ is a stochastic process.

10 For integration in abstract spaces see Saks, XII, Chapter 1.

11 See Saks, XII, p. 36.

12 Kolmogoroff, X, Chapter 5.










$\{E_j\}$ is a sequence of disjunct sets belonging to $\mathfrak{F}_1$ whose sum is also in $\mathfrak{F}_1$, then
$P_1(E)$ may be so defined on the smallest Borel field $\mathfrak{B}_1$ including $\mathfrak{F}_1$ that it becomes
a probability measure that coincides with Q on sets of $\mathfrak{F}_1$.¹³


THEOREM 4. Let $\Omega$ be an independent, stationary, stochastic process corresponding
to a probability space $\Omega_1$ and let $E_1$ be any measurable set, $E_1 \subset \Omega_1$. Then if
g(x) is the characteristic function of $E_1$ on $\Omega_1$, we have


$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g(x_j) = P_1(E_1)$


almost everywhere on $\Omega$.¹⁴


THEOREM 5. On an independent, stationary, stochastic process every system
transformation is measure preserving.¹⁵
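
Theorem 5 can be illustrated by exact enumeration over a finite horizon. The sketch below is an illustration under stated assumptions, not part of the paper: the process is a sequence of independent 0/1 trials, the system `bet_after_one` (bet on the first play, and afterwards only after a 1) is a hypothetical example, and because only finitely many plays are examined the k-th bet may not yet have occurred, in which case the image is recorded as `None`.

```python
from fractions import Fraction
from itertools import product


def bet_after_one(history):
    """Hypothetical system: f_1 = 1; for n > 1, f_n = 1 iff x_{n-1} = 1."""
    return 1 if not history or history[-1] == 1 else 0


def system_transform(xs, system, k):
    """First k coordinates of T(xs), or None if fewer than k bets occur in xs."""
    selected = []
    for n in range(len(xs)):
        # The decision for play n+1 uses only the earlier outcomes xs[:n].
        if system(xs[:n]) == 1:
            selected.append(xs[n])
            if len(selected) == k:
                return tuple(selected)
    return None


def selected_distribution(system, horizon, k, p_one=Fraction(1, 2)):
    """Exact distribution of the first k selected outcomes over `horizon` plays."""
    dist = {}
    for xs in product((0, 1), repeat=horizon):
        weight = Fraction(1)
        for x in xs:
            weight *= p_one if x == 1 else 1 - p_one
        image = system_transform(xs, system, k)
        dist[image] = dist.get(image, Fraction(0)) + weight
    return dist
```

For a fair coin each pattern of two selected outcomes has probability at most 1/4, and the shortfall is no larger than the probability stored under `None`, which tends to zero as the horizon grows. This agrees with the statement that the transformed sequence has the same distribution as the original one.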


2. Application to “Kollektivs”. Much work has been done in recent years
on refining the definition of “Kollektiv” as formulated by von Mises.¹⁶ In
this section we shall derive a consequence of Theorem 5 which includes some
of the results concerning the existence of certain “admissible numbers” or
“Kollektivs”.¹⁷

DEFINITION 21. Let Q(E) be an additive, non-negative set function defined
on a field $\mathfrak{F}_1$ in a space $\Omega_1$, with $Q(\Omega_1) = 1$. To every set $E \subset \Omega_1$ make
correspond the numbers


$Q^*(E) = \operatorname{g.l.b.}_{E_1 \supset E,\; E_1 \in \mathfrak{F}_1} Q(E_1)$ and $Q_*(E) = \operatorname{l.u.b.}_{E_1 \subset E,\; E_1 \in \mathfrak{F}_1} Q(E_1)$.


Let $\mathfrak{J}_1$ be the collection of sets such that $Q^*(E) = Q_*(E)$. It is readily verified
that $\mathfrak{J}_1$ is a field, $\mathfrak{J}_1 \supset \mathfrak{F}_1$, and we define for every $E \in \mathfrak{J}_1$, $Q(E) = Q^*(E) =
Q_*(E)$. $\mathfrak{J}_1$ is the collection of Jordan measurable sets with respect to $\mathfrak{F}_1$.


THEOREM 6. Let $P_1$ be a probability measure defined on a denumerable field $\mathfrak{F}_1$
in a space $\Omega_1$. Let $\mathfrak{J}_1$ be the collection of Jordan measurable sets with respect
to $\mathfrak{F}_1$, and $\mathfrak{B}_1$ the Borel extension of $\mathfrak{J}_1$. Since $P_1$ may be defined on $\mathfrak{J}_1$ and
then on $\mathfrak{B}_1$ coherently with its definition on $\mathfrak{F}_1$ so that it is a probability measure,
$\Omega_1$ becomes a probability space. Let $\Omega$ be the independent, stationary, stochastic
process associated with $\Omega_1$ and let $T_1, T_2, \cdots$ be any sequence of system
transformations on $\Omega$. Finally, write $T_n\{x_1, x_2, \cdots\} = \{x_1^n, x_2^n, \cdots\}$. Then there
is a set Z of measure zero on $\Omega$ such that


$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_E(x_j^n) = P_1(E)$

13 See, for example, Hahn, VII, p. 433.

14 This theorem is known as the strong law of large numbers. For a proof see VIII, p. 37,
or V, p. 764.

15 Doob (IV, p. 365) proved this for the case of real-valued chance variables. The proof
is the same as for this case. This theorem (which we shall obtain later as a special case of
Theorem 10 of this paper) is what corresponds in the classical theory to the von Mises
“Regellosigkeit” principle. See XI, p. 14.

16 See von Mises, XI.

17 See, for example, Copeland, II and III, and Wald, XIII.










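for every $\omega \in \Omega - Z$, every $E \in \mathfrak{J}_1$ ($g_E(x)$ denoting the characteristic function of E),
and all $n = 1, 2, \cdots$.


Proof. Let $E_1, E_2, \cdots$ be the total collection of sets in $\mathfrak{F}_1$. By Theorem 4

$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{E_i}(x_j) = P_1(E_i)$

except on a set $A_i$ of measure zero $(i = 1, 2, \cdots)$. Write $A = A_1 + A_2 + \cdots$,
and $C_n = T_n^{-1}(A)$. Since A is of measure zero and each $T_n$ is measure preserving
(Theorem 5), $C_n$ is also of measure zero. Finally $T_n$ may fail to be defined on a
set $D_n$ of measure zero. Write $Z = A + (C_1 + C_2 + \cdots) + (D_1 + D_2 + \cdots)$;
then $P(Z) = 0$. Now take $\omega$ to be any point $\in \Omega - Z$, E to be any set in $\mathfrak{J}_1$
and n to be any positive integer. There exist two sequences of sets $O_k$ and $I_k$
such that $I_k \subset E \subset O_k$, $I_k \in \mathfrak{F}_1$, $O_k \in \mathfrak{F}_1$ $(k = 1, 2, 3, \cdots)$, and
$\lim_{k \to \infty} P_1(I_k) = \lim_{k \to \infty} P_1(O_k) = P_1(E)$. The sets $I_k$ and $O_k$ are among the sets $E_1, E_2, \cdots$,
hence we have for all k


$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{I_k}(x_j) = P_1(I_k)$ and $\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{O_k}(x_j) = P_1(O_k)$,


since $\omega$ was taken out of $Z \supset A_1 + A_2 + \cdots$. Also $T_n(\omega)$ is not in
$A_1 + A_2 + \cdots$, for otherwise $\omega$ would be contained in some $C_n$, and this
contradicts the assumption that $\omega \in \Omega - Z$. Hence the last written relations
hold when $x_j$ is replaced by $x_j^n$; thus

$\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{I_k}(x_j^n) = P_1(I_k)$ and $\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{O_k}(x_j^n) = P_1(O_k)$.

Since $I_k \subset E \subset O_k$, we have $g_{I_k}(x) \leq g_E(x) \leq g_{O_k}(x)$ for all $x \in \Omega_1$. Hence

$\frac{1}{N} \sum_{j=1}^{N} g_{I_k}(x_j^n) \leq \frac{1}{N} \sum_{j=1}^{N} g_E(x_j^n) \leq \frac{1}{N} \sum_{j=1}^{N} g_{O_k}(x_j^n)$,


whence


$P_1(I_k) = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{I_k}(x_j^n) \leq \liminf_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_E(x_j^n)$

$\leq \limsup_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_E(x_j^n) \leq \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_{O_k}(x_j^n) = P_1(O_k)$.


Since this is true for all k, we have

$P_1(E) \leq \liminf \leq \limsup \leq P_1(E)$,


so that


$\liminf = \limsup = \lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} g_E(x_j^n) = P_1(E)$.¹⁸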


18 If we exclude trivial cases by insisting that $P_1$ have at least two different positive
values, the set of points $\omega$ for which the conclusion of the theorem holds always has the
power of the continuum.
















3. The von Mises definition of independence. The axioms and definitions
of the theory of probability as expounded by von Mises have been shown to
correspond to theorems in the classical theory. Thus, for example, the various
formulations of the law of large numbers correspond to the frequency definition
of probability, and Theorem 5 of this paper corresponds to the “Regellosigkeit”
principle. We shall now prove a theorem that expresses the fundamental idea
in the von Mises definition of independence. There is a close analogy between
this theorem and Theorem 5, and the method of proof here employed is similar
to that of Doob.¹⁹

Let $\Omega$ be any stochastic process and $\Omega'$ an independent and stationary
stochastic process. Let $\Omega^0$ be the space of all sequences $\omega^0 = \{(x_1, y_1),
(x_2, y_2), \cdots\}$, where $\{y_1, y_2, \cdots\} \in \Omega$ and $\{x_1, x_2, \cdots\} \in \Omega'$. A unique
probability measure is defined on $\Omega^0$ by the conditions


$P\{(x_1 \in E_1) \cdots (x_n \in E_n)(y_1 \in F_1) \cdots (y_n \in F_n)\}$
$= P(x_1 \in E_1) \cdots P(x_n \in E_n) \, P\{(y_1 \in F_1) \cdots (y_n \in F_n)\}$.


THEOREM 7. Let $E_1, E_2, \cdots$ be an infinite sequence of measurable sets in the
range space of $\Omega$ such that the probability is one that an infinite number of the
conditions $(y_n \in E_n)$ are satisfied. Let $a_n(\omega^0)$ be the n-th subscript such that
$y_{a_n} \in E_{a_n}$. To every point $\omega^0$ of $\Omega^0$ make correspond the point $\omega' \in \Omega'$, $\omega' =
\{x_1', x_2', \cdots\}$, where $x_n' = x_{a_n}$. The transformation T so defined is defined
almost everywhere on $\Omega^0$ taking values in $\Omega'$ and is measure preserving in the sense
that if $A'$ is any measurable set in $\Omega'$ and $A^0 = T^{-1}(A')$ is the total set of points
of $\Omega^0$ whose images are in $A'$, then $A^0$ is measurable and $P(A^0) = P(A')$.


Proof. It is sufficient to prove the theorem for sets $A'$ of the form


(1)   $A' = \{x_1' \in E_1', \cdots, x_n' \in E_n'\}$,

where $E_1', \cdots, E_n'$ are measurable sets in the range space of $\Omega'$. We proceed
by induction.

Let $n = 1$; $A' = \{x_1' \in E_1'\}$. We have, excepting always the set of measure
zero where T may not be defined,

(2)   $A^0 = \{(y_1 \in E_1)(x_1 \in E_1')\} + \{(y_1 \in CE_1)(y_2 \in E_2)(x_2 \in E_1')\} + \cdots$.


Hence, since the sets in this sum are disjunct we have, by the definition of
measure on $\Omega^0$,


(3)   $P(A^0) = P(x_1 \in E_1') P(y_1 \in E_1) + P(x_2 \in E_1') P\{(y_1 \in CE_1)(y_2 \in E_2)\} + \cdots$.


The last written expression is the product of $P(x_1 \in E_1')$ by the probability that
at least one of the conditions $(y_n \in E_n)$ is satisfied, hence $P(A^0) = P(x_1 \in E_1') =
P(A')$.

This proves the theorem for $n = 1$. Assume that it is true for $n - 1$. Write


19 IV.













$A' = \{x_1' \in E_1', \cdots, x_n' \in E_n'\}$, $A^0 = \{T^{-1}(A')\}$. In order to prove that $A^0$ is
measurable consider the functions $x_j'(\omega^0) = x_{a_j}(\omega^0)$.

$\{x_{a_j}(\omega^0) \in E\} = \{a_j = j\}\{x_j \in E\} + \{a_j = j + 1\}\{x_{j+1} \in E\} + \cdots$.


The sets $\{x_m \in E\}$ are measurable, if E is any measurable set in the range
space of $\Omega'$. The set $\{a_j = m\}$ is the set where the j-th y-coördinate of $\omega^0$
which belongs to its $E_i$ is $y_m$; this set is the sum of sets where $y_m$ is in $E_m$ and
of the preceding y's precisely $j - 1$ are in their $E_i$. These summands, and
hence their sum, and therefore, finally, the set $\{x_j' \in E\}$ are measurable. Hence
$A^0$, being the intersection of the n sets $\{x_j' \in E_j'\}$, is measurable.

Denote by $A_0$ the set $\{x_j' \in E_j' \ (j = 1, \cdots, n - 1)\}$. Then


(4)   $A^0 = \sum_{j=n}^{\infty} \{\omega^0 \in A_0, \ a_n = j, \ x_j \in E_n'\}$.

Consider any summand in (4): $\{\omega^0 \in A_0, \ a_n = j, \ x_j \in E_n'\}$. It is readily verified
that the set $\{\omega^0 \in A_0, \ a_n = j\}$ is a cylinder set over the first $j - 1$ (x, y)-coördinates
of $\omega^0$. Hence


(5)   $P(\omega^0 \in A_0, \ a_n = j, \ x_j \in E_n') = P(\omega^0 \in A_0, \ a_n = j) \, P(x_j \in E_n')$.


From (4) and (5), the stationary character of $\Omega'$, and the disjunct nature of the
summands in (4) we obtain


(6)   $P(A^0) = P(x_n \in E_n') \sum_{j=n}^{\infty} P(\omega^0 \in A_0, \ a_n = j)$.

The last written summation is the measure of the intersection of $A_0$ with the
set where at least one of the conditions $(y_j \in E_j)$ is satisfied for $j \geq n$: the
measure of the latter set is one; hence


(7)   $P(A^0) = P(x_n \in E_n') \cdot P(A_0)$.


The theorem follows immediately from the induction hypothesis.
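
The product structure used in (3) and (7) can be checked exactly on a small example. In the sketch below, which is an illustration and not part of the paper, the x-process is a sequence of independent 0/1 trials, the y-process is a hypothetical two-state Markov chain, and every set $E_n$ is taken to be {1}; the names and the transition probability are assumptions made only for this example. Since only finitely many coordinates are examined, the k-th selection may be missing, and that event is recorded under `None`.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def y_weight(ys):
    """Probability of a y-path under a hypothetical two-state Markov chain."""
    stay = Fraction(3, 4)  # assumed probability of repeating the previous state
    w = HALF               # y_1 is 0 or 1 with probability 1/2 each
    for prev, cur in zip(ys, ys[1:]):
        w *= stay if cur == prev else 1 - stay
    return w


def x_weight(xs, p_one):
    """Probability of an x-path of independent trials with P(x = 1) = p_one."""
    w = Fraction(1)
    for x in xs:
        w *= p_one if x == 1 else 1 - p_one
    return w


def select(xs, ys, k):
    """First k x-coordinates at indices n with y_n = 1, or None if too few."""
    picked = [x for x, y in zip(xs, ys) if y == 1]
    return tuple(picked[:k]) if len(picked) >= k else None


def joint_distribution(horizon, k, p_one=Fraction(1, 3)):
    """Exact distribution of the first k selected x-coordinates."""
    dist = {}
    for ys in product((0, 1), repeat=horizon):
        for xs in product((0, 1), repeat=horizon):
            key = select(xs, ys, k)
            weight = y_weight(ys) * x_weight(xs, p_one)
            dist[key] = dist.get(key, Fraction(0)) + weight
    return dist
```

In this finite setting the probability of each pattern of selected outcomes equals the product of the individual x-probabilities times the probability that k selections have occurred, which is the truncated form of (6) and (7).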


4. The invariance of expectation. Theorem 5 asserts that under certain 
hypotheses all probability relations are invariant under a system transforma- 
tion. In this section we make less restrictive hypotheses on the stochastic 
process to obtain the conclusion (Theorem 8) that the “fairness” of a gambling 
game of which the stochastic process is a mathematical description is invariant 
under a system—where the criterion for fairness is expressed, as usual, in terms 
of the vanishing of certain expectations. 

DEFINITION 22. Let $\Omega$ be a stochastic process associated with a probability
space consisting of real numbers; let n be a positive integer; and suppose that
$x_j(\omega)$ is summable for $j = 1, 2, \cdots$. $Q(E) = \int_E x_n(\omega) \, dP$, $E \in \mathfrak{B}_{1, \cdots, (n-1)}$,
is a finite, completely additive set function on $\mathfrak{B}_{1, \cdots, (n-1)}$ that vanishes whenever
$P(E) = 0$. Hence there exists a summable function on $\Omega$, uniquely determined
except for a set of measure zero and depending on the coördinates
$x_1, \cdots, x_{n-1}$ only, say $E(x_1, \cdots, x_{n-1}; x_n)$, such that


$\int_E x_n(\omega) \, dP = \int_E E(x_1, \cdots, x_{n-1}; x_n) \, dP$


for every set $E \in \mathfrak{B}_{1, \cdots, (n-1)}$. $E(x_1, \cdots, x_{n-1}; x_n)$ is the conditional expectation
of $x_n$ for given $x_1, \cdots, x_{n-1}$.


THEOREM 8. Let $\Omega$ be a real-valued stochastic process in which the functions
$x_n(\omega)$ are uniformly bounded. Let T be a system transformation on $\Omega$ taking
$\omega = \{x_1, x_2, \cdots\}$ into $\omega' = \{x_1', x_2', \cdots\}$. If $E(x_1, \cdots, x_{n-1}; x_n)$ vanishes
for all n and almost all $x_1, \cdots, x_{n-1}$, then $E(x_1', \cdots, x_{n-1}'; x_n')$ vanishes for
all n and almost all $x_1', \cdots, x_{n-1}'$.


Proof. Our hypothesis is that

(8)   $\int_M E(x_1, \cdots, x_{n-1}; x_n) \, dP = \int_M x_n \, dP = 0$


for all n and all measurable cylinder sets M over $x_1, \cdots, x_{n-1}$. We are to
prove that


(9)   $\int_{M'} E(x_1', \cdots, x_{n-1}'; x_n') \, dP = \int_{M'} x_n' \, dP = 0$

for all n and all measurable cylinder sets $M'$ over $x_1', \cdots, x_{n-1}'$. It is sufficient
to prove this result for sets $M'$ of the form

$M' = \{x_1' \in E_1, \cdots, x_{n-1}' \in E_{n-1}\}$,

where $E_1, \cdots, E_{n-1}$ are Borel sets on the real line.²⁰


We start then with $M'$ and derive an expression for $P(M'\{x_n' \in E_n\})$, where
$E_n$ is a Borel set on the real line. We have


(10)   $\{x_m' \in E\} = \{a_m = m\}\{x_m \in E\} + \{a_m = m + 1\}\{x_{m+1} \in E\} + \cdots$
       $= \sum_{j=0}^{\infty} \{a_m = m + j\}\{x_{m+j} \in E\}$.

Hence

(11)   $M'\{x_n' \in E_n\} = \{x_1' \in E_1\}\{x_2' \in E_2\} \cdots \{x_{n-1}' \in E_{n-1}\}\{x_n' \in E_n\}$
       $= \Big(\sum_{j_1=0}^{\infty} \{a_1 = j_1 + 1\}\{x_{j_1+1} \in E_1\}\Big) \cdots \Big(\sum_{j_n=0}^{\infty} \{a_n = j_n + n\}\{x_{j_n+n} \in E_n\}\Big)$
       $= \sum_{j_n=0}^{\infty} \cdots \sum_{j_1=0}^{\infty} \{a_1 = j_1 + 1\} \cdots \{a_n = j_n + n\}\{x_{j_1+1} \in E_1\} \cdots \{x_{j_n+n} \in E_n\}$.²¹
20 Every measurable set over $x_1', \cdots, x_{n-1}'$ can be obtained from sets such as $M'$ by at
most a denumerable sequence of sum, product, and complement processes. If the integral
vanishes for all sets such as $M'$, it will vanish for sums of such sets, etc.

21 $a_1(\omega), a_2(\omega), \cdots$ is the subscript sequence associated with T, as in Definition 15.













The last expression follows from the ordinary algebra of point sets. We make
certain easily verifiable remarks about the last written sum.

(12) The sets being added are disjunct in pairs.

(13) Since, for example, $a_2$ is always greater than $a_1$, the set $\{a_1 = j_1 + 1\}
\{a_2 = j_2 + 2\}$ is empty if $j_1 + 1 \geq j_2 + 2$; i.e., it is empty when $j_1 > j_2$.

In virtue of (13) we have from (11)


(14)   $M'\{x_n' \in E_n\} = \sum_{j_n=0}^{\infty} \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} \{a_1 = j_1 + 1\} \cdots \{a_n = j_n + n\}\{x_{j_1+1} \in E_1\} \cdots \{x_{j_n+n} \in E_n\}$.


From this we have, in virtue of (12),


(15)   $P(M'\{x_n' \in E_n\}) = \sum_{j_n=0}^{\infty} \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} P(\{a_1 = j_1 + 1\} \cdots \{a_n = j_n + n\}\{x_{j_1+1} \in E_1\} \cdots \{x_{j_n+n} \in E_n\})$.


Let us write

$M_{j_1, \cdots, j_n} = \{a_1 = j_1 + 1\} \cdots \{a_n = j_n + n\}\{x_{j_1+1} \in E_1\} \cdots \{x_{j_{n-1}+n-1} \in E_{n-1}\}$.


It is immediately verifiable that the set $M_{j_1, \cdots, j_n}$ is a cylinder set over the
coördinates $x_1, \cdots, x_{j_n+n-1}$.²² Write also


$M_{j_n} = \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} M_{j_1, \cdots, j_n}$.

Since we already saw that the summands are disjunct sets, we have

(16)   $P(M_{j_n}) = \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} P(M_{j_1, \cdots, j_n})$.


Since the $x_n$ were assumed to be uniformly bounded, the $x_n'$ will be uniformly
bounded, and therefore summable.


(17)   $\int_{M'} x_n' \, dP = \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \, P(M'\{r\delta \leq x_n' < (r + 1)\delta\})$.

Then, from (15)

(18)   $\int_{M'} x_n' \, dP = \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j_n=0}^{\infty} P(M_{j_n}\{r\delta \leq x_{j_n+n} < (r + 1)\delta\})$.


We may now define the function $y(\omega)$ to be equal to $x_{j_n+n}$ on $M_{j_n}$ $(j_n =
0, 1, 2, \cdots)$, and to be zero elsewhere. (The sets $M_{j_n}$ are disjunct sets. The
function $y(\omega)$ really depends on n, but since n is being held fixed in this
discussion we do not indicate this dependence.) Then


22 See (ii), Definition 14. 










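(19)   $\int_{M'} x_n' \, dP = \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j_n=0}^{\infty} P(M_{j_n}\{r\delta \leq y < (r + 1)\delta\})$
       $= \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \, P\Big(\sum_{j_n=0}^{\infty} M_{j_n}\{r\delta \leq y < (r + 1)\delta\}\Big)$
       $= \int_{\sum_{j_n=0}^{\infty} M_{j_n}} y \, dP = \sum_{j_n=0}^{\infty} \int_{M_{j_n}} y \, dP = \sum_{j_n=0}^{\infty} \int_{M_{j_n}} x_{j_n+n} \, dP$
       $= \sum_{j_n=0}^{\infty} \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \, P(M_{j_n}\{r\delta \leq x_{j_n+n} < (r + 1)\delta\})$
       $= \sum_{j_n=0}^{\infty} \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} P(M_{j_1, \cdots, j_n}\{r\delta \leq x_{j_n+n} < (r + 1)\delta\})$
       $= \sum_{j_n=0}^{\infty} \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \, P(M_{j_1, \cdots, j_n}\{r\delta \leq x_{j_n+n} < (r + 1)\delta\})$
       $= \sum_{j_n=0}^{\infty} \sum_{j_{n-1}=0}^{j_n} \cdots \sum_{j_1=0}^{j_2} \int_{M_{j_1, \cdots, j_n}} x_{j_n+n} \, dP = 0$.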


This concludes the proof of Theorem 8. 
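
The vanishing of the integrals in (9) can be observed exactly in a small example. The sketch below is an illustration under assumptions that are not in the paper: the game stakes 1 at the start and after a win and stakes 2 after a loss, each play being won or lost with probability 1/2, so that every conditional expectation vanishes and the outcomes are bounded by 2; the system `bet_after_loss` bets on the first play and afterwards only after a loss. Only finitely many plays are enumerated, so the integral of $x_2'$ is taken over the part of $\{x_1' = v\}$ where the second bet has already occurred.

```python
from fractions import Fraction
from itertools import product


def build_path(signs):
    """Hypothetical fair game: stake 1 at the start or after a win, 2 after a loss."""
    xs = []
    for s in signs:
        stake = 2 if xs and xs[-1] < 0 else 1
        xs.append(s * stake)
    return tuple(xs)


def bet_after_loss(history):
    """System: f_1 = 1; for n > 1 bet iff the previous outcome was a loss."""
    return 1 if not history or history[-1] < 0 else 0


def system_transform(xs, system, k):
    """First k coordinates of T(xs), or None if fewer than k bets occur in xs."""
    selected = []
    for n in range(len(xs)):
        # The decision for play n+1 uses only the earlier outcomes xs[:n].
        if system(xs[:n]) == 1:
            selected.append(xs[n])
            if len(selected) == k:
                return tuple(selected)
    return None


def truncated_integrals(horizon):
    """Exact integral of x'_2 over {x'_1 = v, second bet within horizon}, per v."""
    weight = Fraction(1, 2 ** horizon)
    totals = {}
    for signs in product((-1, 1), repeat=horizon):
        image = system_transform(build_path(signs), bet_after_loss, 2)
        if image is not None:
            first, second = image
            totals[first] = totals.get(first, Fraction(0)) + weight * second
    return totals
```

Each set $\{x_1' = v, \ a_2 = j\}$ depends only on the plays before the j-th, so every term of the truncated sum vanishes separately, exactly as in the last line of (19).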

We remark on the analogy between this theorem and Theorem 5. Here we
proved, essentially, that if $E(x_1, \cdots, x_{n-1}; x_n)$ has a constant value
independent of n and $x_1, \cdots, x_{n-1}$, then $E(x_1', \cdots, x_{n-1}'; x_n')$ will also have that
constant value independent of n and $x_1', \cdots, x_{n-1}'$. Theorem 5, on the other
hand, may be phrased as follows. If $P(x_1, \cdots, x_{n-1}; x_n \in E)$ has a constant
value depending only on E, but not on n or $x_1, \cdots, x_{n-1}$, then $P(x_1', \cdots, x_{n-1}';
x_n' \in E)$ will also have that constant value dependent only on E, but not on n
or $x_1', \cdots, x_{n-1}'$.


5. The invariance of asymptotic independence. 

DEFINITION 23. The stochastic process $\Omega$ is uniformly asymptotically
independent if there exists a probability measure $F$ on its range space such that
$P(x_1, \cdots, x_{n-1};\, x_n \in E)$ converges uniformly in $\omega$ (but not necessarily in $E$)
almost everywhere to $F(E)$, where $E$ is any measurable set.

The first purpose of this section is to prove the following theorem. 


THEOREM 9. The property of uniform asymptotic independence is invariant 
under every system transformation. 


Instead of proving this theorem we shall state and prove the following 
slightly stronger one. 


THEOREM 10. If for some $\epsilon > 0$, measurable set $E$, and positive integer $n_0$,
$|P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E)| < \epsilon$ for all $n \ge n_0$ almost everywhere, then
$|P(x'_1, \cdots, x'_{n-1};\, x'_n \in E) - F(E)| < \epsilon$ for all $n \ge n_0$ almost everywhere (where
the $x'_n$ are obtained from the $x_n$ by a system transformation).







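The important difference between this theorem and the preceding one is that
here we definitely state that the $\epsilon$ and $n_0$ after the system transformation are
the same as before.

Proof. Let $\Gamma'_{n-1}$ be any measurable cylinder set over $x'_1, \cdots, x'_{n-1}$. Then

$$
\begin{aligned}
\Big| \int_{\Gamma'_{n-1}} \big[ P(x'_1, \cdots, x'_{n-1};\, x'_n \in E) - F(E) \big]\,dP \Big|
&= \big| P(\Gamma'_{n-1}\{x'_n \in E\}) - P(\Gamma'_{n-1})F(E) \big| \\
&= \Big| \sum_{j=0}^{\infty} \big[ P(\Gamma'_{n-1}\{a_n = n+j\}\{x_{n+j} \in E\}) - F(E)P(\Gamma'_{n-1}\{a_n = n+j\}) \big] \Big| \\
&\le \sum_{j=0}^{\infty} \int_{\Gamma'_{n-1}\{a_n = n+j\}} \big| P(x_1, \cdots, x_{n+j-1};\, x_{n+j} \in E) - F(E) \big|\,dP \\
&< \sum_{j=0}^{\infty} \int_{\Gamma'_{n-1}\{a_n = n+j\}} \epsilon\,dP = \int_{\Gamma'_{n-1}} \epsilon\,dP.
\end{aligned}
\tag{20}
$$

Since this is true uniformly in $\Gamma'_{n-1}$, we have $|P(x'_1, \cdots, x'_{n-1};\,
x'_n \in E) - F(E)| < \epsilon$ for $n \ge n_0$ almost everywhere. This concludes the proof
of Theorem 10, and therefore of Theorem 9.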

The reason for stating Theorem 10 in its present form is that in this form
Theorem 5 is easily seen to follow from it. For, according to the hypotheses of
Theorem 5, the hypotheses of Theorem 10 are satisfied with $n_0 = 1$ and every
$\epsilon > 0$. The conclusion of Theorem 10 then assures us that after any system
transformation the conditions are still satisfied, with $n_0 = 1$ and every $\epsilon > 0$.
This implies that the transformed process is independent, stationary, and has
the same distributions as the original stochastic process.
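
The special case just described (an independent, stationary process keeps its
distributions after a system transformation) can be checked numerically. The
following is only a minimal sketch under stated assumptions: the process is
taken to be independent with values 0 and 1 and $P(x_n = 0) = p$, and the rule
`select_after_zero` (select coördinate $n$ when $n = 1$ or $x_{n-1} = 0$) is a
hypothetical sequence of choice functions, assumed to satisfy Definition 14;
neither the rule nor the function names come from the paper.

```python
import random

def select_after_zero(history):
    """Hypothetical choice function f_n: 1 if n = 1 or x_{n-1} = 0, else 0."""
    return 1 if not history or history[-1] == 0 else 0

def transformed_prefix(k, p, rng):
    """Return (x'_1, ..., x'_k) for an independent 0/1 process with P(x_n = 0) = p."""
    history, chosen = [], []
    while len(chosen) < k:
        take = select_after_zero(history)   # decided before x_n is observed
        x = 0 if rng.random() < p else 1    # draw x_n
        if take:
            chosen.append(x)                # x_n becomes the next x'
        history.append(x)
    return chosen

def estimate_frequencies(p=0.6, k=3, trials=20000, seed=0):
    """Monte Carlo estimates of P(x'_k = 0) and P(x'_{k-1} = 0, x'_k = 0)."""
    rng = random.Random(seed)
    last_zero = both_zero = 0
    for _ in range(trials):
        xs = transformed_prefix(k, p, rng)
        last_zero += xs[-1] == 0
        both_zero += xs[-2] == 0 and xs[-1] == 0
    return last_zero / trials, both_zero / trials
```

If the transformed process is again independent with the same distribution, the
two estimates should be close to $p$ and $p^2$ respectively, up to sampling error.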

Besides the situation just mentioned, there are many other important ex- 
amples of uniformly asymptotically independent stochastic processes. It is 
clear, however, that uniform asymptotic independence is a very strong condi- 
tion. There may be stochastic processes which from a practical point of view 
are independent in the long run, without being uniformly asymptotically inde- 
pendent in the sense of Definition 23. In order to investigate such processes 
we make the following definitions. 

DEFINITION 24. The stochastic process $\Omega$ is asymptotically independent in
probability if there exists a probability measure $F$ on its range space such that
$P(x_1, \cdots, x_{n-1};\, x_n \in E)$ converges in probability to $F(E)$, where $E$ is any
measurable set. (That is: to every positive number $\epsilon$ and measurable set $E$
there corresponds a positive integer $n_0$ such that $P\{|P(x_1, \cdots, x_{n-1};\,
x_n \in E) - F(E)| > \epsilon\} < \epsilon$ for $n > n_0$.)

DEFINITION 25. The stochastic process $\Omega$ is $\epsilon$-multiplicative if there exists a
probability measure $F$ on its range space such that to every positive number $\epsilon$
there corresponds a positive integer $n_0$ such that

$$|P(A\{x_n \in E\}) - P(A)F(E)| < \epsilon \quad \text{for } n > n_0,$$

where $E$ is a preassigned measurable set, uniformly in the measurable cylinder
set $A$ over $x_1, \cdots, x_{n-1}$.













Before investigating the invariance under system transformations of asymp- 
totic independence in probability, we need the following auxiliary theorem. 


THEOREM 11. A necessary and sufficient condition that a stochastic process be
asymptotically independent in probability is that it be $\epsilon$-multiplicative.


Proof. We use the notation of Definitions 24 and 25. We first prove that
if the process is $\epsilon$-multiplicative, then it is asymptotically independent in
probability. Suppose that the conclusion of the theorem is false: then there
exists a positive number $\epsilon$ such that

$$P\{|P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E)| > \epsilon\} \ge \epsilon \tag{21}$$

for an infinite number of values of $n$. This implies that either there is an
infinite number of values of $n$ for which the difference is $> \epsilon$ and positive on a
set of measure $\ge \epsilon$, or else that there is an infinite number of values of $n$ for
which the difference is negative and $< -\epsilon$ on a set of measure $\ge \epsilon$. In either
case there is an infinite number of values of $n$ corresponding to which we may
find a measurable cylinder set $A_n$ over $x_1, \cdots, x_{n-1}$ such that $P(A_n) \ge \epsilon$ and
such that the difference in (21) is of one sign on all $A_n$ and is greater in absolute
value than $\epsilon$. For all these values of $n$ we have

$$
\begin{aligned}
|P(A_n\{x_n \in E\}) - P(A_n)F(E)|
&= \Big| \int_{A_n} \big[ P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E) \big]\,dP \Big| \\
&= \int_{A_n} \big| P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E) \big|\,dP \ge \epsilon P(A_n) \ge \epsilon^2.
\end{aligned}
$$

Since this contradicts the assumption of $\epsilon$-multiplicativity, the sufficiency of
the condition is proved.


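Assume now that the process is asymptotically independent in probability.
Then

$$
\begin{aligned}
|P(A\{x_n \in E\}) - P(A)F(E)|
&= \Big| \int_{A} \big[ P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E) \big]\,dP \Big| \\
&\le \int_{A} \big| P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E) \big|\,dP \\
&= \int_{A_\epsilon} |\cdots|\,dP + \int_{A^\epsilon} |\cdots|\,dP,
\end{aligned}
$$

where $A_\epsilon$ is the part of $A$ on which the integrand is $\le \epsilon$, and $A^\epsilon = A - A_\epsilon$. For
$n$ sufficiently large we have, using asymptotic independence in probability,

$$|P(A\{x_n \in E\}) - P(A)F(E)| \le \epsilon P(A_\epsilon) + 2P(A^\epsilon) \le \epsilon + 2\epsilon = 3\epsilon$$

(since $|P(x_1, \cdots, x_{n-1};\, x_n \in E) - F(E)| \le 2$). This concludes the proof of
Theorem 11.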

We are now able to begin the examination of the behavior of asymptotic 
independence in probability under system transformations. 







THEOREM 12. Let $\Omega$ be a stochastic process and let $F$ be a probability measure
on its range space such that $P(x_1, \cdots, x_{n-1};\, x_n \in E)$ converges almost everywhere
to $F(E)$, where $E$ is any measurable set. Then, if $T$ is any system transformation
on $\Omega$, the transformed process $T(\Omega)$ is asymptotically independent in probability.


We note that the conditions put on $\Omega$ and those that $T(\Omega)$ is proved to satisfy
are not the same. The process $\Omega$ is asymptotically independent in probability,
and a little more: the conditional probability distributions are assumed not
merely to converge in probability but to converge almost everywhere; but the
process $T(\Omega)$ is merely asserted to be asymptotically independent in
probability. That this lack of symmetry is in a certain sense in the nature of things
will be shown later.

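Proof. According to Theorem 11 it is sufficient to prove that $T(\Omega)$ is
$\epsilon$-multiplicative; that is, it is sufficient to show that, for every $\epsilon$,
$|P(A'\{x'_n \in E\}) - P(A')F(E)| < \epsilon$ for $n$ sufficiently large, uniformly in the
measurable cylinder set $A'$ over $x'_1, \cdots, x'_{n-1}$. We have, using our usual notation
for system transformations,

$$P(A'\{x'_n \in E\}) = \sum_{j=0}^{\infty} P(A'\{a_n = n+j\}\{x_{n+j} \in E\}). \tag{22}$$

Hence

$$
\begin{aligned}
|P(A'\{x'_n \in E\}) - P(A')F(E)|
&= \Big| \sum_{j=0}^{\infty} \big[ P(A'\{a_n = n+j\}\{x_{n+j} \in E\}) - P(A'\{a_n = n+j\})F(E) \big] \Big| \\
&= \Big| \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}} \big[ P(x_1, \cdots, x_{n+j-1};\, x_{n+j} \in E) - F(E) \big]\,dP \Big| \\
&\le \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}} \big| P(x_1, \cdots, x_{n+j-1};\, x_{n+j} \in E) - F(E) \big|\,dP.
\end{aligned}
\tag{23}
$$

Since $P(x_1, \cdots, x_{n-1};\, x_n \in E)$ converges to $F(E)$, it converges, by Egoroff's
theorem,$^{23}$ uniformly on a set $D$, with $P(D)$ arbitrarily close to one. Hence
for given $\epsilon > 0$ we may select $D$ so that $P(CD) = 1 - P(D) < \epsilon$. On $D$ we
have

$$|P(x_1, \cdots, x_{n+j-1};\, x_{n+j} \in E) - F(E)| < \epsilon \tag{24}$$

for $n$ sufficiently large and $j = 0, 1, 2, \cdots$. Hence, from (23)

$$
\begin{aligned}
|P(A'\{x'_n \in E\}) - P(A')F(E)|
&\le \sum_{j=0}^{\infty} \Big[ \int_{A'\{a_n = n+j\}D} + \int_{A'\{a_n = n+j\}CD} \Big] \big| P(x_1, \cdots, x_{n+j-1};\, x_{n+j} \in E) - F(E) \big|\,dP \\
&\le \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}D} \epsilon\,dP + \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}CD} 2\,dP \\
&= \epsilon P(A'D) + 2P(A' \cdot CD) \le \epsilon + 2\epsilon,
\end{aligned}
\tag{25}
$$

as was to be proved.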


23 Saks, XII, p. 18. 



















THEOREM 13. Under the hypotheses of Theorem 12, to every positive integer $k$,
measurable sets $E_1, \cdots, E_k$, and positive number $\epsilon$ there corresponds a positive
integer $n_0$ such that

$$|P(A'\{x'_{n+1} \in E_1\} \cdots \{x'_{n+k} \in E_k\}) - F(E_1) \cdots F(E_k)P(A')| < \epsilon$$

for all $n > n_0$, uniformly in the measurable cylinder set $A'$ over $x'_1, \cdots, x'_n$.


This slightly more general theorem follows readily from Theorem 12 by 
mathematical induction. 

In order to show that without some extra hypothesis (such as convergence
in Theorem 12) we may not assert asymptotic independence in probability
about $T(\Omega)$, we construct an example of a stochastic process which is
asymptotically independent in probability, but which loses this property under a suitable
system transformation.

Take $0 < q < p < 1$. Let $\Omega$ be a stochastic process in which each $x_n$ takes
only the values 0 and 1, and in which the following conditions are satisfied.
For $2^m \le n \le 2^{m+1} - 1$, $P(x_1, \cdots, x_{n-1};\, x_n = 0)$ depends only on the
coördinates $x_1, \cdots, x_m$ ($m = 0, 1, 2, \cdots$). For each positive integer $m$ arrange
the $2^m$ possible sets of values of $(x_1, \cdots, x_m)$ in some order: say, for definiteness,
that these sets are ordered according to the magnitude of the dyadic fractions
$.x_1 x_2 \cdots x_m$. Let $P(x_1 = 0) = p$. Let $P(x_1, \cdots, x_{2^m-1};\, x_{2^m} = 0)$ be $p$ when
$(x_1, \cdots, x_m)$ is the first set in this ordering and let $P(x_1, \cdots, x_{2^m-1};\, x_{2^m} = 0)$
be $q$ otherwise. Let $P(x_1, \cdots, x_{2^m};\, x_{2^m+1} = 0)$ be $p$ when $(x_1, \cdots, x_m)$ is
the second set and $q$ otherwise; and so on, for each $n$ between $2^m$ and $2^{m+1} - 1$,
and every $m = 0, 1, 2, \cdots$. According to a theorem of Doob's$^{24}$ these
conditions are consistent. That is, if we define $P(x_1, \cdots, x_{n-1};\, x_n = 1)$ to be
$1 - P(x_1, \cdots, x_{n-1};\, x_n = 0)$, then there is a probability measure on $\Omega$ which
has precisely these functions for its conditional probability functions.

We have to show that this process is asymptotically independent in
probability. Let $\epsilon$ be any positive number, $0 < \epsilon < p - q$. The probability measure
of the set where $|P(x_1, \cdots, x_{n-1};\, x_n = 0) - q| > \epsilon$ is the measure of the
set where $P(x_1, \cdots, x_{n-1};\, x_n = 0) = p$. For $2^m \le n \le 2^{m+1} - 1$, this set
is of the form $\{x_1 = x_1^0, \cdots, x_m = x_m^0\}$. We are to prove then that the measure
of such a set approaches zero with $1/n$. But

$$P(x_1 = x_1^0, \cdots, x_m = x_m^0) = P(x_1 = x_1^0, \cdots, x_{m-1} = x_{m-1}^0) \cdot P(x_1^0, \cdots, x_{m-1}^0;\, x_m = x_m^0).$$

The last-written conditional probability is not greater than $r = \max\{p, q,
(1 - p), (1 - q)\}$; whence $P(x_1 = x_1^0, \cdots, x_m = x_m^0) \le r^m$. Hence
$P(x_1, \cdots, x_{n-1};\, x_n = 0)$ converges in measure to $q$.
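
For small $m$ the bound $r^m$ can be verified exactly by multiplying out the
conditional probabilities along each set of values of $(x_1, \cdots, x_m)$. The sketch
below is illustrative only: it assumes the dyadic ordering described above (so
that the first set is the one with all coördinates 0), and the function names
and the numerical values of $p$ and $q$ used with it are assumptions, not part of
the paper.

```python
from itertools import product

def cond_prob_zero(n, prefix_bits, p, q):
    """P(x_1..x_{n-1}; x_n = 0): p if (x_1..x_m) is set number n - 2**m, else q."""
    m = n.bit_length() - 1                # 2**m <= n <= 2**(m+1) - 1
    rank = 0
    for bit in prefix_bits[:m]:           # 0-based position in the dyadic ordering
        rank = 2 * rank + bit
    return p if rank == n - 2 ** m else q

def prefix_probability(bits, p, q):
    """P(x_1 = bits[0], ..., x_m = bits[m-1]), computed by the chain rule."""
    prob = 1.0
    for n in range(1, len(bits) + 1):
        p0 = cond_prob_zero(n, bits[:n - 1], p, q)
        prob *= p0 if bits[n - 1] == 0 else 1.0 - p0
    return prob

def largest_prefix_probability(m, p, q):
    """Largest measure of a set {x_1 = b_1, ..., x_m = b_m}."""
    return max(prefix_probability(bits, p, q)
               for bits in product((0, 1), repeat=m))
```

Comparing `largest_prefix_probability(m, p, q)` with `max(p, q, 1 - p, 1 - q) ** m`
for a few values of `m` gives a direct check of the inequality used in the text.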

We now define a system transformation $T$ by defining the sequence of choice
functions.$^{25}$ Let $f_n(x_1, \cdots, x_{n-1})$ be 1 or 0 according as $P(x_1, \cdots, x_{n-1};\,
x_n = 0) = p$ or $q$. It is readily verified that this sequence of functions satisfies
all the conditions of Definition 14.

24 VI, §4.

25 Definition 14.







For $2^m \le n \le 2^{m+1} - 1$, $f_n$ depends on the coördinates $x_1, \cdots, x_m$ only
($m = 0, 1, 2, \cdots$). Thus $f_1 = 1$; $f_2 = 1$ if $x_1 = 0$ and $f_2 = 0$ otherwise; $f_3 = 0$
if $x_1 = 0$ and $f_3 = 1$ otherwise. Hence we see that $a_2$ takes only the values 2
and 3 and takes these values on the sets $\{P(x_1;\, x_2 = 0) = p\}$ and $\{P(x_1, x_2;\,
x_3 = 0) = p\}$, respectively. It is similarly verified that for all $n$, $2^{n-1} \le a_n
\le 2^n - 1$, and that the set where $a_n$ takes one of its values, say $\{a_n = k\}$,
coincides with the set $\{P(x_1, \cdots, x_{k-1};\, x_k = 0) = p\}$.

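It is now easy to see that $T(\Omega)$ is not asymptotically independent in
probability. For if it were, $P(x'_n = 0)$ would certainly converge to $q$. But

$$
\begin{aligned}
P(x'_n = 0) - q
&= \sum_{j=0}^{\infty} \big[ P(\{a_n = n+j\}\{x_{n+j} = 0\}) - P(a_n = n+j)\,q \big] \\
&= \sum_{j=0}^{\infty} \int_{\{a_n = n+j\}} \big[ P(x_1, \cdots, x_{n+j-1};\, x_{n+j} = 0) - q \big]\,dP \\
&= \sum_{j=2^{n-1}}^{2^n - 1} \int_{\{a_n = j\}} \big[ P(x_1, \cdots, x_{j-1};\, x_j = 0) - q \big]\,dP \\
&= \int_{\{2^{n-1} \le a_n \le 2^n - 1\}} (p - q)\,dP = p - q.
\end{aligned}
$$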
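
The behaviour of this example can also be observed by simulation. The following
is a minimal sketch, not a construction from the paper: it samples the process
coördinate by coördinate from the conditional probabilities defined above,
applies the choice functions $f_n$, and estimates $P(x'_n = 0)$. The function names,
the dyadic ordering convention (first set = all zeros), and the values
$p = 0.8$, $q = 0.3$ are illustrative assumptions.

```python
import random

def selects(n, history):
    """f_n = 1 exactly when P(x_1..x_{n-1}; x_n = 0) = p in the example process."""
    m = n.bit_length() - 1                # 2**m <= n <= 2**(m+1) - 1
    rank = 0
    for bit in history[:m]:               # 0-based position of (x_1..x_m)
        rank = 2 * rank + bit
    return rank == n - 2 ** m

def sample_path(num_blocks, p, q, rng):
    """Simulate x_1..x_N with N = 2**num_blocks - 1; also return a_1..a_num_blocks."""
    history, selected = [], []
    for n in range(1, 2 ** num_blocks):
        take = selects(n, history)
        if take:
            selected.append(n)            # n is the next selected index a_k
        prob_zero = p if take else q
        history.append(0 if rng.random() < prob_zero else 1)
    return history, selected

def estimate_transformed_zero(n=6, p=0.8, q=0.3, trials=4000, seed=0):
    """Monte Carlo estimate of P(x'_n = 0), where x'_n = x_{a_n}."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(trials):
        history, selected = sample_path(n, p, q, rng)
        a_n = selected[n - 1]
        zeros += history[a_n - 1] == 0
    return zeros / trials
```

With these parameters the estimate should lie near $p$ rather than $q$, up to
sampling error, in agreement with the computation $P(x'_n = 0) - q = p - q$.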


6. Asymptotic expectation theorems. In this section we shall prove the 
theorems that stand in the same relation to the theorems of the preceding 
section as Theorem 8 stands to Theorem 5. 

THEOREM 14. If $\Omega$ is a real-valued stochastic process such that the functions
$x_n(\omega)$ are uniformly bounded and such that $|E(x_1, \cdots, x_{n-1};\, x_n)| < \epsilon$ for all
$n \ge n_0$ almost everywhere, then $|E(x'_1, \cdots, x'_{n-1};\, x'_n)| < \epsilon$ for all $n \ge n_0$ almost
everywhere (where the $x'_n$ are obtained from the $x_n$ by a system transformation).


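Proof. Let $\Gamma'_{n-1}$ be any measurable cylinder set over $x'_1, \cdots, x'_{n-1}$. Then

$$
\begin{aligned}
\Big| \int_{\Gamma'_{n-1}} E(x'_1, \cdots, x'_{n-1};\, x'_n)\,dP \Big|
&= \Big| \int_{\Gamma'_{n-1}} x'_n\,dP \Big| \\
&= \Big| \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta\, P(\Gamma'_{n-1}\{r\delta \le x'_n < (r+1)\delta\}) \Big| \\
&= \Big| \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j=0}^{\infty} P(\Gamma'_{n-1}\{a_n = n+j\}\{r\delta \le x_{n+j} < (r+1)\delta\}) \Big|.
\end{aligned}
\tag{26}
$$

We now define the auxiliary function $y$ to be $x_{n+j}$ on $\Gamma'_{n-1}\{a_n = n+j\}$ and zero
elsewhere. Then

$$
\begin{aligned}
\Big| \int_{\Gamma'_{n-1}} E(x'_1, \cdots, x'_{n-1};\, x'_n)\,dP \Big|
&= \Big| \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j=0}^{\infty} P(\Gamma'_{n-1}\{a_n = n+j\}\{r\delta \le y < (r+1)\delta\}) \Big| \\
&= \Big| \int_{\sum_{j=0}^{\infty} \Gamma'_{n-1}\{a_n = n+j\}} y\,dP \Big|
 = \Big| \sum_{j=0}^{\infty} \int_{\Gamma'_{n-1}\{a_n = n+j\}} x_{n+j}\,dP \Big| \\
&\le \sum_{j=0}^{\infty} \int_{\Gamma'_{n-1}\{a_n = n+j\}} |E(x_1, \cdots, x_{n+j-1};\, x_{n+j})|\,dP \\
&< \sum_{j=0}^{\infty} \int_{\Gamma'_{n-1}\{a_n = n+j\}} \epsilon\,dP = \int_{\Gamma'_{n-1}} \epsilon\,dP \quad \text{for } n \ge n_0.
\end{aligned}
$$

Since this is true uniformly in $\Gamma'_{n-1}$, we have $|E(x'_1, \cdots, x'_{n-1};\, x'_n)| < \epsilon$,
for all $n \ge n_0$, almost everywhere. This concludes the proof of Theorem 14.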
We note that the proof of this theorem remains unaltered if we remove the 
absolute value signs from its statement. This shows, for example, that if the 
conditional expectations are all negative, they will remain negative after any 
system transformation. 
We note also that this theorem stands in the same relation to Theorem 8 as
Theorem 10 to Theorem 5.

THEOREM 15. If $\Omega$ is a real-valued stochastic process such that the functions
$x_n(\omega)$ are uniformly bounded and such that $E(x_1, \cdots, x_{n-1};\, x_n)$ converges to zero
almost everywhere as $n \to \infty$, then $E(x'_1, \cdots, x'_{n-1};\, x'_n)$ converges to zero in
probability (where the $x'_n$ are obtained from the $x_n$ by a system transformation).

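Proof. A necessary and sufficient condition that $E(x'_1, \cdots, x'_{n-1};\, x'_n)$
converge to zero in probability is that to every positive number $\epsilon$ there correspond
a positive integer $n_0$, such that for $n > n_0$, $\big| \int_{A'} x'_n\,dP \big| < \epsilon$ whatever the
measurable set $A'$ over $x'_1, \cdots, x'_{n-1}$ may be. (By definition
$\int_{A'} E(x'_1, \cdots, x'_{n-1};\, x'_n)\,dP = \int_{A'} x'_n\,dP$.) We have now

$$
\begin{aligned}
\int_{A'} x'_n\,dP
&= \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta\, P(A'\{r\delta \le x'_n < (r+1)\delta\}) \\
&= \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j=0}^{\infty} P(A'\{a_n = n+j\}\{r\delta \le x_{n+j} < (r+1)\delta\}).
\end{aligned}
\tag{27}
$$

We define the function $y$ to be $x_{n+j}$ on $A'\{a_n = n+j\}$ and zero elsewhere.
Then

$$
\begin{aligned}
\Big| \int_{A'} x'_n\,dP \Big|
&= \Big| \lim_{\delta \to 0} \sum_{r=-\infty}^{\infty} r\delta \sum_{j=0}^{\infty} P(A'\{a_n = n+j\}\{r\delta \le y < (r+1)\delta\}) \Big| \\
&= \Big| \int_{\sum_{j=0}^{\infty} A'\{a_n = n+j\}} y\,dP \Big|
 = \Big| \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}} x_{n+j}\,dP \Big| \\
&\le \sum_{j=0}^{\infty} \int_{A'\{a_n = n+j\}} |E(x_1, \cdots, x_{n+j-1};\, x_{n+j})|\,dP.
\end{aligned}
\tag{28}
$$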







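By hypothesis $E(x_1, \cdots, x_{n-1};\, x_n)$ converges to zero almost everywhere. We
apply Egoroff's theorem. Given any positive number $\epsilon$, there exists a set $D$
such that $P(CD) = 1 - P(D) < \epsilon$ and such that on $D$ we have

$$|E(x_1, \cdots, x_{n+j-1};\, x_{n+j})| < \epsilon \tag{29}$$

for $n$ sufficiently large and $j = 0, 1, 2, \cdots$. Then

$$
\begin{aligned}
\Big| \int_{A'} x'_n\,dP \Big|
&\le \sum_{j=0}^{\infty} \Big[ \int_{A'\{a_n = n+j\}D} + \int_{A'\{a_n = n+j\}CD} \Big] |E(x_1, \cdots, x_{n+j-1};\, x_{n+j})|\,dP \\
&\le \sum_{j=0}^{\infty} \Big[ \int_{A'\{a_n = n+j\}D} \epsilon\,dP + \int_{A'\{a_n = n+j\}CD} K\,dP \Big],
\end{aligned}
\tag{30}
$$

where $K$ is the common upper bound of the functions $x_n$. (If $|x_n| \le K$,
$|E(x_1, \cdots, x_{n-1};\, x_n)| \le K$.) Hence, finally,

$$\Big| \int_{A'} x'_n\,dP \Big| \le \epsilon P(A'D) + KP(A' \cdot CD) < \epsilon + K\epsilon. \tag{31}$$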


This concludes the proof of Theorem 15. 

We note that the example of the preceding section shows that the convergence
in probability of $E(x_1, \cdots, x_{n-1};\, x_n)$ is not invariant under every system
transformation.


BIBLIOGRAPHY 


I. Z. W. Birnbaum and J. Schreier, Eine Bemerkung zum starken Gesetz der grossen
Zahlen, Studia Math., vol. 4(1933), pp. 85-89.

II. A. H. Copeland, Admissible numbers in the theory of geometrical probability, Am.
Jour. of Math., vol. 53(1931), pp. 153-162.

III. A. H. Copeland, Admissible numbers in the theory of probability, Am. Jour. of
Math., vol. 50(1928), pp. 535-552.

IV. J. L. Doob, Note on probability, Annals of Math., vol. 37(1936), pp. 363-367.

V. J. L. Doob, Probability and statistics, Trans. Amer. Math. Soc., vol. 36(1934),
pp. 759-775.

VI. J. L. Doob, Stochastic processes with an integral-valued parameter, Trans. Amer.
Math. Soc., vol. 44(1938), pp. 87-150.

VII. H. Hahn, Über die Multiplikation total-additiver Mengenfunktionen, Annali della
R. Sc. Norm. Sup. di Pisa, vol. 2(1933), pp. 429-452.

VIII. E. Hopf, On causality, statistics, and probability, Jour. of Math. and Phys., vol.
(1934), pp. 51-102.

IX. H. Hunremann, Über den mathematischen Kern des Prinzips vom ausgeschlossenen
Spielsystem und eine darauf gegründete Wahrscheinlichkeitstheorie, Deutsche
Mathematik, vol. 2(1937), pp. 593-622.

X. A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Berlin, 1933.

XI. R. von Mises, Wahrscheinlichkeitsrechnung, Wien, 1931.

XII. S. Saks, Theory of the Integral, Warsaw, 1937.

XIII. A. Wald, Die Widerspruchsfreiheit des Kollektivbegriffes der
Wahrscheinlichkeitsrechnung, Ergebnisse eines math. Kolloq., vol. 8(1936), pp. 38-72.


UNIVERSITY OF ILLINOIS. 
















CONTENTS


Surfaces of negative curvature and permanent regional transitivity.
By Anna Grant

The measure of geodesic types on surfaces of negative curvature.
By Gustav A. Hedlund

A proof that every uniformly convex space is reflexive. By B. J. Pettis

Differentiation in Banach spaces. By B. J. Pettis

Non-commutative arithmetic. By Robert Dilworth

Asymptotic forms for a general class of hypergeometric functions with
applications to the generalized Legendre functions.
By George E. Albert

Preservation of partial limits in multiple sequence transformations.
By Hugh J. Hamilton

Convergence theorems for continued fractions. By Walter Leighton

Generalized problem of Bolza in the calculus of variations.
By M. R. Hestenes

A generalized Lambert series. By J. M. Dobbie

Persymmetric and Jacobi determinant expressions for orthogonal
polynomials. By Vivian Eberle Spencer

The algebra of lattice functions. By Morgan Ward

Modular fields I: Separating transcendence bases.
By Saunders Mac Lane

Oscillating functions. By R. P. Boas, Jr.

A differential equation for orthogonal polynomials. By J. Shohat

On Bernoulli's numbers and Fermat's last theorem (second paper).
By H. S. Vandiver

Note on topological mappings. By J. H. Roberts

Contributions to the theory of groups of finite order. By Oystein Ore

Invariants of certain stochastic transformations: The mathematical theory
of gambling systems. By Paul R. Halmos








